Concept of Algorithm • Computer Architecture • The Concept of Algorithm • Complexity and Order • NP-complete Problems

Self Introduction • Name: Takeaki Uno • Affiliation: NII, Sokendai • Age, status: 44, professor • Research area: algorithms, data mining, bioinformatics, operations research • Recent studies: efficient practical algorithms for basic operations and basic tasks on huge data in genome science, data mining, etc.

Goal / Evaluation / Reference • Algorithms are efficient especially for processing big data • The goal is to learn the skill and sense of developing algorithms, and of viewing problems and issues from an algorithmic point of view • Evaluation: a report at the end of the semester • Reference: any textbook titled "Algorithm" and/or "Data Structure" that you find interesting and understandable • e-mail: [email protected] • Homepage: http://research.nii.ac.jp/~uno/index-j.html (the slides are uploaded there)

Contents of the Lecture • Architecture of computers and programs: CPU, memory, OS, memory management, instructions, registers • Data structures (efficient ways of recording and using data): stack, queue, heap, binary tree, hash • Basic design of algorithms: recursion, divide-and-conquer, enumeration, dynamic programming, sorting, string matching • Graph algorithms: matching, shortest paths, ...

Connection to Latest Research • Algorithms are connected to many areas of current research (algorithms are studied in the areas to which they are related) • Especially areas with simple mathematical models and criteria together with huge or complicated data + Bioinformatics: the human genome is 3 billion letters long, written with only 4 kinds of letters (ATGC), so computers have an advantage; genes and variations often follow implicit rules + Chemoinformatics: chemical compounds are networks of atoms, so they are tractable for computers; however, their stereographic (3D) structures are difficult

Connection to Latest Research + Astronomy: the computer systems of observatories generate huge amounts of data, which are accumulated into integrated databases + Physics (quantum mechanics): data from particle accelerators are incredibly large; at present we can only filter and store the data + Social science: recently there has been progress in understanding the systems of societies by simulations that include micro-level activities

Connection to Latest Research + Architecture: besides structural calculations to evaluate the strength of buildings, material and structure optimizations reduce costs + Literature: large collections of texts allow us to find interesting knowledge by computer processing + Others: the development of simulation science, which aims to understand real-world phenomena and mechanisms through simulation

Latest Research in Japan • Japanese algorithmic research is at a high level internationally + interior point methods for optimization problems + discrete mathematics and combinatorial optimization + compression and succinct indexes + database search and database construction + data mining + computational geometry ... • Studies on algorithms are currently mainly theoretical; engineering results will be derived from them in the future

Architecture of Computer

Basic Architecture • The center of a computer is the CPU (central processing unit), which basically performs elementary arithmetic operations • The CPU is like the engine of a car; having a CPU could be taken as the definition of a computer • Besides the CPU, a computer has memory, which records many values (each essentially 0 or 1)

Interfaces • The monitor, keyboard, mouse, etc. are connected to the CPU and are controlled and managed by signals given by the CPU • From the CPU's point of view, sending and receiving these signals looks like accessing memory: writing a value to a specified memory address causes a signal to be sent • So everything is operated by signals, in particular 0/1 values
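To make the "devices look like memory" idea above concrete, here is a minimal C sketch of memory-mapped I/O; the device and its register address 0x40000000 are hypothetical, not taken from the slides.

```c
/* Minimal sketch of memory-mapped I/O, assuming a hypothetical device whose
   control register is mapped at address 0x40000000 (made-up address).
   From the CPU's point of view this is just an ordinary memory write,
   but on such a device it would cause a signal to be sent to the hardware. */
#include <stdint.h>

#define DEVICE_REGISTER ((volatile uint8_t *)0x40000000u)  /* hypothetical address */

void send_signal(uint8_t value) {
    *DEVICE_REGISTER = value;   /* writing to this "memory" drives the device */
}
```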
Program • The CPU can: + read a value from memory and write a value to memory + perform arithmetic operations such as addition and division + compare values and branch accordingly • These are called instructions (the values they operate on are called operands) • The sequence of instructions is written in a part of memory; this is called the program • Execution means carrying out the instructions according to the program

Execution of a Program • The CPU executes the instructions written in the program sequentially • Instructions are written as numbers, and each function is assigned a number • During execution, the program can change the memory location from which the next instruction is read (a jump) • Branching is done this way: jump or not according to the result of a comparison (a conditional jump)

Programming Languages • CPU instructions are numbers, and each individual operation is very simple (this is called machine language, or machine code) • Building a program by combining these directly is quite hard • So we normally use a more abstract, human-readable "high-level language" (C, Java, Perl, Basic, shell scripts, etc.) • To have the CPU execute a program written in a high-level language, we either translate the program into machine language once (this is called compiling) or use a program that carries out the processing the program describes (called an interpreter)

Conceptual Example • 1: write 0 to location 15 • 2: write 2 to location 16 • 3: add the values at locations 15 and 16 and write the result to location 15 • 4: if the value at location 15 is 10 or more, go to 8 • 5: go to 3 • 8: ... • A locally repeated part of the execution, such as lines 3–5 here, is called a loop

Memory Access • Memory stores numbers one byte at a time • Memory locations are numbered starting from 0, and the data at any location can be read or written in a single step • Memory that allows one-step access to any location like this is called random access memory • Storage devices that are not random access include CDs, hard disks, and tapes

Representation of Numbers (Bytes) • Memory stores numbers, but in fact each cell can only remember 0 or 1 (this is called a bit) • However, a very large quantity of 0/1 values can be stored • Since 0/1 alone is not very useful, eight bits are grouped together and regarded as a binary number, which can represent 0–255; this group is called a byte • When an arithmetic operation (addition, subtraction, multiplication) carries into the 9th binary digit, or the result becomes negative, that part is simply ignored • Example: 1 0 1 1 0 1 0 1 represents 128+32+16+4+1 = 181

Hexadecimal • When dealing with numbers stored in a computer, it is naturally clearer to talk in binary (for example, "the largest number that fits in one byte is 11111111" is intuitive) • But writing everything in 0/1 takes space: the number of digits becomes large • So binary digits are often grouped four at a time and written in base 2^4 = 16, i.e., hexadecimal • The values 10–15 are written a, b, c, d, e, f; for example, 14 = d and 255 = ff

Storing Data • Memory has no mechanism for remembering whether a stored value is an integer, a character, and so on (although this could be recorded separately as additional values), so the program itself must keep track of what it wrote where • That is, the program decides "this kind of data is written at this place" and, when computing on the data, reads and writes at that fixed place • Since every cell always holds 0 or 1, "not yet written" cannot be detected either

Integers, Reals, and Characters • Integer: several bits taken together and regarded as a binary number; usually 32 bits = 4 bytes (roughly 0 to 4 billion) • Real: an integer stored together with the position of the (binary) point; usually 56 bits for the digits and 8 bits for the position of the point (256 possible positions) • Character: a code table assigns an integer value to each character, and characters are stored as those integers • Example: 1 0 1 1 0 1 0 1 is 128+32+16+4+1 = 181 as an integer, while 0101.10101 read as a binary fraction is 4+1 + (16+4+1)/32 = 5.65625

Representation of Negative Numbers (Bytes) • Negative numbers: a byte whose highest bit is 1 is interpreted as the stored value minus 256, i.e., as a negative number • Computing with it as an ordinary (unsigned) number and computing with it as a negative number give the same bit pattern, so no conversion of results is needed • Example: 255 stands for −1; adding 1 gives 11111111 + 1 = 100000000, and since the 9th digit is ignored the result is 0, which matches −1 + 1 • Example: −2 is 254 and −5 is 251; their sum is 505, and 505 mod 256 = 249, which corresponds to −7; their product is 63754, and 63754 mod 256 = 10, which is (−2) × (−5)

Character Code Table (ASCII) • The code giving a one-to-one correspondence between integers and characters (standard ASCII defines 0–127) is called the ASCII code, and characters written with it are ASCII characters (more or less the world standard) • When characters are stored in a computer, they are converted to numbers according to this table and recorded one byte per character (so they can be compared by magnitude) • An excerpt of the table (codes in hexadecimal):

| 20 sp | 21 ! | 22 " | 23 # | 24 $ | 25 % | 26 & | 27 ' |
| 28 ( | 29 ) | 2a * | 2b + | 2c , | 2d - | 2e . | 2f / |
| 30 0 | 31 1 | 32 2 | 33 3 | 34 4 | 35 5 | 36 6 | 37 7 |
| 38 8 | 39 9 | 3a : | 3b ; | 3c < | 3d = | 3e > | 3f ? |
| 40 @ | 41 A | 42 B | 43 C | 44 D | 45 E | 46 F | 47 G |
| 48 H | 49 I | 4a J | 4b K | 4c L | 4d M | 4e N | 4f O |
| 50 P | 51 Q | 52 R | 53 S | 54 T | 55 U | 56 V | 57 W |
| 58 X | 59 Y | 5a Z | 5b [ | 5c \ | 5d ] | 5e ^ | 5f _ |
| 60 ` | 61 a | 62 b | 63 c | 64 d | 65 e | 66 f | 67 g |
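The byte arithmetic and character codes above can be checked with a small C program; this is a minimal sketch using the values from the slides (the signed reinterpretation assumes the usual two's-complement behaviour).

```c
/* Small check of the byte arithmetic above: unsigned bytes wrap around
   modulo 256, and reinterpreting the same bits as a signed byte gives
   the negative-number reading described in the slides. */
#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t a = 254, b = 251;          /* the bit patterns of -2 and -5 */
    uint8_t sum  = (uint8_t)(a + b);   /* 505   mod 256 = 249 */
    uint8_t prod = (uint8_t)(a * b);   /* 63754 mod 256 = 10  */
    printf("sum  = %d, read as signed: %d\n", sum, (int8_t)sum);    /* 249, -7 */
    printf("prod = %d, read as signed: %d\n", prod, (int8_t)prod);  /* 10,  10 */
    printf("'A' is stored as %d (hex %x)\n", 'A', 'A');             /* 65, 41  */
    return 0;
}
```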
Japanese Character Codes • For Japanese there are Japan-specific code tables (they are used as international standards, so Japanese text can also be read abroad) • For historical reasons there are several codes: - JIS code - Shift-JIS code - EUC code - Unicode

JIS Code • ASCII defines only 0–127, so kana and symbols are placed in the 128–255 range

JIS Kanji Code • For kanji, when two special characters called a control (escape) code appear, the text switches to Japanese mode; each kanji then takes 2 bytes • When the control code appears again, the text returns to alphanumeric characters • (The slide shows an excerpt of the JIS kanji table: row 16, Shift-JIS first byte 0x88, beginning 亜 唖 娃 阿 哀 愛 ...)

Shift-JIS Code • The code used on Windows • It represents 2-byte characters without using control characters • If the first byte falls in a certain symbol range, it is interpreted together with the following byte as one 2-byte character, so those single-byte symbols can no longer be used • The second byte can coincide with ordinary characters such as the newline code, so text viewed with software that does not support Japanese turns into a mess

EUC Code • The code mainly used on UNIX • It also represents 2-byte characters without control characters • If the first byte is in the single-byte symbol/kana range, it is interpreted together with the second byte as a kanji code • The second byte is also in the symbol/kana range, so a byte can be recognized as part of a kanji • As a result, single-byte symbols and kana cannot be output

UNICODE • One code intended to represent the characters of all the world's languages • Characters are represented in 2 bytes, without control characters • Kanji and other characters whose glyph shapes differ by region are unified into one code point, so in practice a country-specific font is still needed • Recently 4-byte representations have also come into use

Cache Memory • The memory access speed of a computer is much slower than its computation speed (by a factor of 10 or more) • So when memory is read, computation has to stop and wait for a while; the same holds when the next instruction is fetched • Therefore, when memory is read, a large block is read at once • The data that was read is kept for a while in fast memory directly attached to the CPU (this operation is called caching, and the memory used for it is called cache memory)

Speedup by Caching • If most memory accesses hit locations that are already in the cache, computation becomes much faster • Storage layouts that raise cache efficiency are therefore important; computations arranged so that they keep looking at data already in the cache can be accelerated • Disk access uses a cache in the same way, with ordinary memory playing the role of the CPU-attached memory; since memory is faster to access than disk, this also gives a speedup

Variables • In high-level languages, values are stored using variables • A variable holds one value • When the program is compiled and executed, each variable is assigned a memory location; from then on, accessing the variable means accessing that location

Arrays • When handling large amounts of data, assigning a separate variable to every data item is painful, and so is writing the program • So arrays are used • With an array, data is accessed as "variable + index"; the index can itself be a variable, so a task such as "set 100 variables to 0" can be done easily with a loop

OS • Input/output methods differ from machine to machine • To draw a character on the display, data must be written so that it forms the shape of the character • There are programs that absorb such hardware differences, perform this kind of low-level processing, and manage the programs being executed • Such a program is called an OS (Windows, UNIX, etc.) • A program that handles a specific attached device through a standard input/output interface is called a device driver • Memory management is also done by the OS

Allocating Memory • Normally, several programs run on a computer at the same time (including device drivers) • Therefore, writing values to arbitrary memory locations would disturb the execution of other programs (in the worst case, things stop working) • So when a program wants to use memory, it asks the OS to assign it a region of the required size that it is allowed to use (this is called allocating memory)
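As a minimal sketch tying variables, arrays, and memory allocation together (standard C library calls, not code from the slides): the program asks for a region of memory and then zeroes 100 array entries with a loop.

```c
/* Minimal sketch: allocate an array through the C library (which asks the OS
   for memory) and use a loop with an index variable to set 100 entries to 0. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int n = 100;
    int *a = malloc(n * sizeof(int));   /* request memory instead of writing anywhere */
    if (a == NULL) return 1;            /* the request may be refused */

    for (int i = 0; i < n; i++)         /* "variable + index" access */
        a[i] = 0;

    printf("a[0] = %d, a[99] = %d\n", a[0], a[n - 1]);
    free(a);                            /* give the region back */
    return 0;
}
```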
Complexity and Order

Ways to Solve • "How to solve" and "the solution" are different things + stew versus the way to cook stew + the way to solve a puzzle ring + the way to solve the Tower of Hanoi + the solution of the quadratic equation x² − 2x + 1 = 0 • A "way to solve" is more abstract than a solution; a solution is the answer to just one instance • Here we consider ways to solve

Computer Algorithm • Algorithm: a set (sequence) of instructions for carrying out a specified task • Usually an algorithm means one for computers • Algorithmics can be seen as the theory of designing ways of computing, or the theory of efficient programs • Summation from 1 to 100: + 1+2+...+100: 99 additions + (1+100) × (100/2): 3 operations • Cutting a carrot into star-shaped slices: + make slices, then cut the edges of each slice + or make a star-shaped stick first, then slice it

Algorithms in Your Life • how to cook • calculation with figures written down on paper • car driving skill • employment interviews in companies ... (none of these are complete algorithms ...)

Evaluation Criteria • How do we evaluate algorithms? + simplicity of the program + efficiency (speed, space, cost, ...): speed and memory usage (and power consumption) • Simplicity is difficult to evaluate, but speed and memory usage can be measured as numbers

Evaluation of Time • It is easy to evaluate the speed of a program: just measure the duration of its execution • Of course, the quality of the program's code affects this • But that is an evaluation of the programmer, not of the algorithm: even for the same algorithm, the programmer's skill makes a difference • We want a good model for measuring the efficiency of algorithms themselves

Turing Machine • Making a model of algorithms means making a model of computers • We need a very basic one • The basis of a computer is a CPU and memory; the Turing machine is such a model • A Turing machine is composed of a "tape" that records a sequence of 0/1 data, a "head" that reads and writes 0/1 data, and a control unit that manages the movement of the tape • A Turing machine can read and write data, move the tape, and change the internal state of the control unit

A Model for Computation Time • The time of a Turing machine is evaluated by the number of read/write operations • Basically, any operation of a real computer can be simulated by several operations of a Turing machine, and the number of operations does not depend on the time or size of each individual operation • So both can be evaluated with the same model: the number of operations executed by an algorithm is a good measure

Abstracted Evaluation • The idea of counting basic operations is good and much more abstract, but the skill of the programmer still has an effect • We want an even more abstract measure that is independent of that • And what exactly do we want to evaluate? • For example, measuring efficiency on one specified task, or on a benchmark problem, is still basically affected by programming skill

Evaluation for Input Size • Computation needs some input, and as the size of the input increases, the computation time grows • Observe how the time increases as the size increases: we get a function representing this relation • For input size n (the number of bits, to be exact), we express the number of necessary operations as a function such as 10n² + 5n + 2 • This varies among inputs of the same size — should we use the average? To guarantee the time, we use the maximum
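As a small, made-up illustration of expressing the number of basic operations as a function of the input size n (not code from the slides): the double loop below performs exactly n(n−1)/2 comparisons, so its operation count has the form c₂n² + c₁n + c₀.

```c
#include <stdio.h>

/* Made-up example of counting basic operations as a function of n:
   the double loop performs exactly n*(n-1)/2 comparisons, so the
   operation count is about 0.5*n^2 - 0.5*n plus loop bookkeeping. */
static int count_inverted_pairs(const int a[], int n) {
    int count = 0;
    for (int i = 0; i < n; i++)
        for (int j = i + 1; j < n; j++)
            if (a[i] > a[j])          /* one basic comparison per step */
                count++;
    return count;
}

int main(void) {
    int a[] = {3, 1, 4, 1, 5};
    printf("%d inverted pairs\n", count_inverted_pairs(a, 5));  /* n=5: 10 comparisons */
    return 0;
}
```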
Ignoring the Coefficients • We want to know how the number of operations grows, and this depends mainly on the term of largest degree: in 10n² + 5n + 2 we focus on 10n² and ignore the rest • Programming skill makes simple operations even simpler, which affects the coefficients; so in 10n² + 5n + 2 we focus only on n² and ignore all constant factors • This way of evaluating the cost of computation is called evaluation by order

Mathematical Definition • "A function f(n) is O(g(n))" is defined by lim_{n→∞} f(n)/g(n) < +∞ • We say that the order of an algorithm is O(g(n)) if the function bounding the maximum number of basic operations over all inputs of size n is O(g(n)) • The order of the computation time is called the time complexity, or simply the complexity

Summary of Order • Order of computation time O(f(n)): a (polynomial or exponential) function giving an upper bound on the computation time in terms of the input size + coefficients are ignored, since they reflect programming skill + only the dominant term matters (for large inputs, only the largest degree is relevant) • Finding a word in a dictionary of n items: + linear search O(n) + binary search O(log n) • Sorting n numbers: + insertion sort, quick sort O(n²) + merge sort, heap sort O(n log n) • Algorithms of small order are efficient even if the computer and/or the programmer is not good

Advantage / Disadvantage • When the problem is big, the acceleration obtained from algorithm theory is drastic (a factor of 2–3 at size 100 can become a factor of 10,000 at size one million) • When the problem is small, practical performance may be poor because of the sophisticated construction of the operations • The order cannot represent average or practical performance

Evaluation of Memory Usage • The efficiency of memory usage can also be evaluated by order and complexity + the order represents the worst-case amount of memory used, as a function of the input size + this complexity is called the space complexity • Here, too, we can ignore programming skill • For memory usage, worst-case evaluation is even more acceptable, since the algorithm simply stops if memory runs short

Fundamentals of Complexity Theory

Good Design • When we can decrease the order of an algorithm, we have "improved" its design • Then some questions arise: which algorithms are good and which are bad? Where is the boundary? • One idea is to compare with the most naïve algorithm • So let us look at the most naïve algorithms

Naïve Algorithm • Here we call a non-well-designed algorithm a naïve algorithm, and take "naïve" to mean exploring all possibilities + finding all numbers among a1,…,an that are less than b: examine all subsets of a1,…,an and output the one consisting exactly of the numbers less than b + sorting: examine all possible orders and output the one that is increasing + finding the longest decreasing subsequence of a1,…,an: examine all subsequences of a1,…,an and output the longest among the decreasing ones

Time of Naïve Algorithms + finding all numbers among a1,…,an that are less than b, and finding the longest decreasing subsequence of a1,…,an: computation time O(n·2ⁿ) + sorting: computation time O(n·n!) ≈ O(n·2^(n log n)) • The time spent by these algorithms, which is exponential in the input size, is called exponential time, and such algorithms are called exponential-time algorithms
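A short derivation of the estimate n! ≈ 2^(n log n) used just above; the same bound log₂ n! = Θ(n log n) reappears in the lower-bound discussion at the end of this section.

```latex
\log_2 n! = \sum_{k=1}^{n} \log_2 k \le n \log_2 n,
\qquad
\log_2 n! \ge \sum_{k=\lceil n/2\rceil}^{n} \log_2 k \ge \frac{n}{2}\log_2\frac{n}{2},
```

so log₂ n! = Θ(n log n), and hence n! = 2^(Θ(n log n)).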
In "Good" Ways + finding all numbers among a1,…,an that are less than b: scan the numbers and output those less than b; computation time O(n) + finding the longest decreasing subsequence of a1,…,an: for each ai, find the longest decreasing subsequence ending at ai, then take the longest over all ai; computation time O(n²) + sorting: scan the sequence and swap neighboring pairs that are in the wrong order, and repeat this n times; computation time O(n²)

"Good" Algorithms • The times above are O(n), O(n²), and O(n²): none of them is exponential • Running times that are polynomial in the input size are called polynomial time, and the corresponding algorithms are called polynomial-time algorithms • Any polynomial, even of large degree, is smaller than any exponential for all sufficiently large n, so the two are essentially different; hence polynomial time is "good", in some sense

Difficulty of Problems • Good algorithms solve problems in a short time • So the existence of a good algorithm means that the problem is easy to solve (there is no need to test all possibilities); difficulty indicates the cost of developing algorithms + polynomial time means the problem is probably easy + no polynomial-time algorithm being known means the problem is probably difficult • ... and then we can compare the difficulty of two problems

Exact Comparison • We can compare the difficulty of problems by comparing the orders of the algorithms known for them • Strictly speaking, however, this is not sufficient: faster algorithms might exist • So we consider special situations in which we can compare with certainty, even though this is not perfectly general

Problem Reduction • Suppose there are two types of problems, A and B, and any instance of B can be transformed into an instance of A, or any instance of B can be solved by solving an instance of A plus some extra work + example: finding the kth largest of n numbers — after sorting the numbers we can find it in O(n) time, so this problem can be solved via sorting • Transforming an instance of B into an instance of A, or solving an instance of B by solving an instance of A, is said to "reduce problem B to A" • In this case A is at least as difficult as B, provided solving A takes no less time than the extra work • Input and output always take O(n) time, so finding the kth largest value is no harder than sorting

Difficult Problems • Easy problems are certified by exhibiting fast algorithms • How can we certify difficulty? • It is hard to prove a statement like "any algorithm needs exponential time" • Instead, we look for the most difficult problems among those having polynomial/exponential-time algorithms, using the comparison of difficulties above: we show that every problem in the class can be reduced to this one problem in polynomial time (so it is hardest within the class) • Are there such "useful" problems?

Satisfiability Problem • There is such a "useful" problem, called the satisfiability problem (SAT) • SAT asks whether there is an assignment to Boolean variables x1,…,xn that satisfies a given formula; we may assume the formula is a CNF (a conjunction of clauses, each a disjunction of literals) • The trick is that any computer circuit can be represented by a formula • A circuit (of exponential size) can simulate a nondeterministic Turing machine, which can examine exponentially many possibilities at once, in parallel • On such a nondeterministic machine, difficult combinatorial problems can be solved in polynomial time • SAT is the most difficult among these problems (problems that seem to need exponential time), since all of the others can be reduced to SAT
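To make SAT concrete, here is a hedged C sketch of the naïve exponential-time approach: try all 2ⁿ assignments of a tiny hard-coded CNF. The formula and its encoding are made up for illustration and are not from the slides.

```c
/* Brute-force SAT check of a tiny hard-coded CNF (made-up example).
   Each clause lists its literals: +i means xi, -i means NOT xi, 0 ends the clause. */
#include <stdio.h>
#include <stdlib.h>

#define N 3   /* variables x1..x3 */
#define M 3   /* clauses */

static const int clauses[M][3] = {
    { 1,  2,  0},   /* (x1 or x2)         */
    {-1,  3,  0},   /* (not x1 or x3)     */
    {-2, -3,  0},   /* (not x2 or not x3) */
};

static int satisfied(const int assign[N + 1]) {
    for (int c = 0; c < M; c++) {
        int ok = 0;
        for (int j = 0; j < 3 && clauses[c][j] != 0; j++) {
            int lit = clauses[c][j];
            int val = assign[abs(lit)];
            if ((lit > 0 && val) || (lit < 0 && !val)) { ok = 1; break; }
        }
        if (!ok) return 0;   /* one clause is false: formula not satisfied */
    }
    return 1;
}

int main(void) {
    for (unsigned mask = 0; mask < (1u << N); mask++) {   /* 2^n assignments */
        int assign[N + 1];
        for (int i = 1; i <= N; i++) assign[i] = (mask >> (i - 1)) & 1;
        if (satisfied(assign)) {
            printf("satisfiable: x1=%d x2=%d x3=%d\n", assign[1], assign[2], assign[3]);
            return 0;
        }
    }
    printf("unsatisfiable\n");
    return 0;
}
```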
Certification of Difficulty • SAT is a kind of landmark of difficult problems: a problem is "difficult" if it is no easier than SAT • Such a problem is called NP-hard (SAT is NP-hard) • Problems that can be solved in polynomial time by a nondeterministic Turing machine are called NP problems (SAT is in NP); equivalently, a "yes" answer has a certificate of polynomial size • An NP problem that is NP-hard is called NP-complete • The class of NP-complete problems consists of the most difficult problems among all NP problems

SAT → 3SAT • SAT remains NP-complete even if every clause has exactly three literals; SAT is reducible to this 3SAT problem • Reduction: for each clause of a SAT instance, + if its size is less than 3, generate 2 or 4 clauses by appending slack variables: (x) becomes (x∨y∨z) ∧ (x∨y∨¬z) ∧ (x∨¬y∨z) ∧ (x∨¬y∨¬z) + if its size is greater than 3, introduce slack variables and chain the literals together: (a∨b∨c∨d∨e∨f) becomes (a∨b∨x) ∧ (¬x∨c∨y) ∧ (¬y∨d∨z) ∧ (¬z∨e∨f)

SAT → k-Clique • A clique is a subgraph of a graph in which every pair of vertices is connected by an edge • SAT is reducible to deciding whether a graph contains a clique of k vertices • Prepare a vertex for each literal of each clause • Draw edges between all pairs of literals belonging to different clauses, except when one is the negation of the other • Then there is a clique of size (#clauses) if and only if we can choose one literal from each clause without conflicts, i.e., the formula is satisfiable

Independent Set Problem • Independent set: a subset of the vertices of a graph G such that no two of its vertices are connected by an edge • The k-independent-set problem asks whether a given graph has an independent set of size k • This problem is the essence of exclusion constraints (it is composed only of exclusions) • How do we prove its NP-completeness? • By reducing SAT to it

Simulate Selection by Exclusion • The SAT constraint "choose exactly one of xi and ¬xi for each i" is easy to simulate: prepare vertices xi and ¬xi and put an edge between them; an independent set of size n taken from these pairs is an assignment • The next issue is the satisfiability of the clauses • We want to simulate something like "we may choose the vertex corresponding to a clause if one of its literals is in the assignment" — how can we do this with a graph?

Representing Clauses • First, simply connect a clause vertex Ci to its literals • Then, if one of its literals is in the assignment (i.e., chosen), we cannot choose Ci • If Ci is not satisfied, we can choose Ci (but do not have to) • ... This is the opposite of what we want; but if everything is reversed, we only need to add a size constraint and we are done

Reversing the Selection • So let us try the reverse: consider the complement of the assignment, i.e., choose the negations of the assigned literals • But now, whenever any (complemented) literal of clause Ci is chosen, Ci is disabled — even if Ci is satisfied through one of its other literals • So in both this attempt and the previous one, Ci cannot be chosen exactly when we want it to be: the attempt fails • One more idea is needed
Transforming Clauses • Since we want a clause to be choosable exactly when one of its literals is chosen, we prepare one copy of the clause vertex for each of its literals and connect each copy only to its own literal • We should choose at most one of these copies, so we also connect the copies of the same clause to each other (they form a clique) • Then one copy of the clause can be chosen exactly when one of its literals is available

Summary of Reduction • Reduce an instance of SAT • Input: variables x1,…,xn and clauses C1,…,Cm composed of the literals xi, ¬xi • Construct the following graph + vertex set: the literals xi, ¬xi, and the pairs (Ci, xj) of a clause Ci and one of its literals xj + edge set: {xi, ¬xi} for every i; {(Ck, xi), xi} for every clause Ck and literal xi of Ck; and {(Ck, xi), (Ck, xj)} for every two literals xi, xj of the same clause Ck • The graph has an independent set of size n+m if and only if the SAT instance has a satisfying assignment

How Much Can We Do Better? • Exponential time can be reduced to polynomial time by designing good algorithms, and polynomials of low degree are better • Of course we try to derive algorithms of small degree, but where is the limit? There should be explicit limits (called lower bounds) • For example, the O(n) time needed just to read the input is a trivial lower bound

Choose a Model • To establish lower bounds we need a model of computation • For example, consider the problem of finding the maximum of n numbers • Usually we need n time just to read the input, so that looks like a lower bound — but it is not true for parallel computation • As an extreme example, if humans could double their number in constant time, then after growing to n humans in O(log n) time they could find the maximum in a further O(log n) time

Unit of Operation • Memory reads/writes and conditional jumps are the basic operations of a computer, so they should be the unit of counting • With this unit, let us think about lower bounds: the minimum number of operations that are clearly needed to solve a problem + Search: find the number among n numbers that is the maximum, the minimum, or the nearest to a given k + Sort: reorder a sequence of n numbers into increasing order

The Worst Case • Observation: a conditional jump splits the execution into two branches, so if a problem has N possible answers, the computation tree must have at least N leaves • Even if the computation tree is balanced, the worst case must pass through at least log₂ N conditional jumps + Search: there are n possible answers, so we need at least log₂ n steps + Sort: there are n! possible answers, so we need at least log₂ n! ≈ n log n steps

Summary • Fundamentals of computer architecture: CPU, memory, I/O, and the storage hierarchy • Fundamentals of algorithms: a sequence of operations with conditional branches • Definition of order: ignore constant factors and lower-degree terms to capture the global growth with the problem size • Fundamentals of complexity: polynomiality, NP-completeness, lower bounds
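As a concrete check of the reduction summarized above, consider a small made-up instance (not from the slides).

Variables x1, x2; clauses C1 = (x1 ∨ x2), C2 = (¬x1 ∨ x2), so n = 2, m = 2 and we look for an independent set of size 4.
Vertex set: x1, ¬x1, x2, ¬x2, (C1, x1), (C1, x2), (C2, ¬x1), (C2, x2)
Edge set: {x1, ¬x1}, {x2, ¬x2}; {(C1, x1), x1}, {(C1, x2), x2}, {(C2, ¬x1), ¬x1}, {(C2, x2), x2}; {(C1, x1), (C1, x2)}, {(C2, ¬x1), (C2, x2)}
The assignment x1 = true, x2 = true satisfies both clauses; its complement chooses the false literals ¬x1 and ¬x2, and together with the clause copies (C1, x1) and (C2, x2) this gives the independent set {¬x1, ¬x2, (C1, x1), (C2, x2)} of size n + m = 4, as the summary predicts.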