What is a plain English explanation of "Big O" notation?

I'd prefer as little formal definition as possible and simple mathematics.

Quick note: this almost certainly confuses Big O notation (which is an upper bound) with Theta notation (which is a two-sided bound). In my experience this is actually typical of discussions in non-academic settings. Apologies for any confusion caused.


Big O complexity can be visualized with this graph:

Big O Analysis

The simplest definition I can give for Big-O notation is this:

Big-O notation is a relative representation of the complexity of an algorithm.

There are some important and deliberately chosen words in that sentence:

  • relative: you can only compare apples to apples. You can't compare an algorithm that does arithmetic multiplication to an algorithm that sorts a list of integers. But a comparison of two algorithms that do arithmetic operations (one multiplication, one addition) will tell you something meaningful;
  • representation: Big-O (in its simplest form) reduces the comparison between algorithms to a single variable. That variable is chosen based on observations or assumptions. For example, sorting algorithms are typically compared based on comparison operations (comparing two nodes to determine their relative ordering). This assumes that comparison is expensive. But what if comparison is cheap but swapping is expensive? That changes the comparison; and
  • complexity: if it takes me one second to sort 10,000 elements, how long will it take me to sort one million? Complexity in this case is a relative measure to something else.

Come back and reread the above when you've read the rest.

The best example of Big-O I can think of is doing arithmetic. Take two numbers (123456 and 789012). The basic arithmetic operations we learned in school were:

  • addition;
  • subtraction;
  • multiplication; and
  • division.

Each of these is an operation or a problem. A method for solving them is called an algorithm.

Addition is the simplest. You line the numbers up (to the right) and add the digits in a column, writing the last digit of that addition in the result. The "tens" part of that number is carried over to the next column.

Let's assume that the addition of these digits is the most expensive operation in this algorithm. It stands to reason that to add these two numbers together we have to add 6 digits (and possibly carry a 7th). If we add two 100-digit numbers together we have to do 100 additions. If we add two 10,000-digit numbers we have to do 10,000 additions.

See the pattern? The complexity (being the number of operations) is directly proportional to the number of digits n in the larger number. We call this O(n) or linear complexity.
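To make the counting concrete, here is a small sketch of my own (not part of the original answer) that performs the column-by-column addition on digit strings and counts one basic operation per column:

    def add_digits(a: str, b: str) -> tuple[str, int]:
        # right-align the two numbers to the same width
        a, b = a.zfill(len(b)), b.zfill(len(a))
        result, carry, ops = [], 0, 0
        for da, db in zip(reversed(a), reversed(b)):  # one column at a time
            ops += 1
            carry, digit = divmod(int(da) + int(db) + carry, 10)
            result.append(str(digit))
        if carry:
            result.append(str(carry))
        return "".join(reversed(result)), ops

    print(add_digits("123456", "789012"))    # ('912468', 6)  -> 6 column additions
    print(add_digits("9" * 100, "1" * 100))  # 100 column additions

Doubling the number of digits doubles the operation count, which is exactly the O(n) behaviour described above.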

Subtraction is similar (except you may need to borrow instead of carry).

Multiplication is different. You line the numbers up, take the first digit of the bottom number and multiply it in turn against each digit of the top number, and so on through each digit. So to multiply our two 6-digit numbers we must do 36 multiplications. We may need to do as many as 10 or 11 column additions to get the final result as well.

If we have two 100-digit numbers we need to do 10,000 multiplications and 200 additions. For two one-million-digit numbers we need to do one trillion (10¹²) multiplications and two million additions.
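Again as a hedged sketch of my own, counting only the single-digit multiplications of the grade-school method shows the square directly:

    def count_multiplications(a: str, b: str) -> int:
        ops = 0
        for _ in a:        # one partial product per digit of the bottom number
            for _ in b:    # one single-digit multiply per digit of the top number
                ops += 1
        return ops

    print(count_multiplications("123456", "789012"))    # 36     = 6 * 6
    print(count_multiplications("9" * 100, "9" * 100))  # 10000  = 100 * 100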

As the algorithm scales with n-squared, this is O(n²) or quadratic complexity. This is a good time to introduce another important concept:

We only care about the most significant portion of complexity.

The astute may have realized that we could express the number of operations as n² + 2n. But as you saw from our example with two numbers of a million digits apiece, the second term (2n) becomes insignificant (accounting for 0.0002% of the total operations by that stage).

One can notice that we have assumed the worst-case scenario here. While multiplying 6-digit numbers, if one of them has 4 digits and the other has 6 digits, then we only have 24 multiplications. Still, we calculate the worst case for that "n", i.e. when both are 6-digit numbers. Hence Big-O notation is about the worst-case scenario of an algorithm.

The Telephone Book

The next best example I can think of is the telephone book, normally called the White Pages or similar, though it varies from country to country. I'm talking about the one that lists people by surname and then initials or first name, possibly address, and then telephone numbers.

Now if you were instructing a computer to look up the phone number for "John Smith" in a telephone book that contains 1,000,000 names, what would you do? Ignoring the fact that you could guess how far in the S's started (let's assume you can't), what would you do?

A typical implementation might be to open up to the middle, take the 500,000th entry and compare it to "Smith". If it happens to be "Smith, John", we just got really lucky. Far more likely is that "John Smith" will be before or after that name. If it's after, we divide the last half of the phone book in half and repeat. If it's before, we divide the first half of the phone book in half and repeat. And so on.

This is called a binary search and is used every day in programming, whether you realize it or not.

So to find a name in a phone book of a million names, you can actually find any name by doing this at most 20 times. In comparing search algorithms, we decide that this comparison is our "n".

  • For a phone book of 3 names it takes 2 comparisons (at most).
  • For 7 it takes at most 3.
  • For 15 it takes 4.
  • For 1,000,000 it takes 20.

That is staggeringly good, isn't it?
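Here is a rough illustration of my own (not the answer's code) of that phone-book lookup as a binary search that counts its comparisons:

    def binary_search(names, target):
        lo, hi, comparisons = 0, len(names) - 1, 0
        while lo <= hi:
            mid = (lo + hi) // 2          # open the "book" in the middle
            comparisons += 1
            if names[mid] == target:
                return mid, comparisons
            if names[mid] < target:       # the target comes after this entry
                lo = mid + 1
            else:                         # the target comes before this entry
                hi = mid - 1
        return -1, comparisons

    phone_book = sorted(f"Name{i:07d}" for i in range(1_000_000))
    print(binary_search(phone_book, "Name0999999"))  # found in about 20 comparisons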

In Big-O terms, this is O(log n) or logarithmic complexity. Now the logarithm in question could be ln (base e), log₁₀, log₂ or some other base. It doesn't matter: it's still O(log n), just as O(2n²) and O(100n²) are still both O(n²).

It's worthwhile at this point to explain that Big O can be used to determine three cases with an algorithm:

  • Best case: in the telephone book search, the best case is that we find the name in one comparison. This is O(1) or constant complexity;
  • Expected case: as discussed above, this is O(log n); and
  • Worst case: this is also O(log n).

Normally we don't care about the best case. We're interested in the expected and worst case. Sometimes one or the other of these will be more important.

Back to the telephone book.

What if you have a phone number and want to find a name? The police have a reverse phone book, but such look-ups are denied to the general public. Or are they? Technically you can reverse look up a number in an ordinary phone book. How?

You start at the first name and compare the number. If it's a match, great; if not, you move on to the next. You have to do it this way because the phone book is unordered (by phone number, anyway).
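That scan is easy to sketch; in this hedged toy version (mine, with made-up data) the number of comparisons grows directly with the size of the book:

    def reverse_lookup(phone_book, number):
        comparisons = 0
        for name, entry_number in phone_book:   # walk the book entry by entry
            comparisons += 1
            if entry_number == number:
                return name, comparisons
        return None, comparisons

    book = [(f"Name{i:07d}", 5550000 + i) for i in range(1_000_000)]
    print(reverse_lookup(book, 5550000 + 999_999))  # ('Name0999999', 1000000)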

So, to find a name given the phone number (reverse lookup):

  • Best case: O(1);
  • Expected case: O(n) (for 500,000); and
  • Worst case: O(n) (for 1,000,000).

The Traveling Salesman

This is quite a famous problem in computer science and deserves a mention. In this problem, you have N towns. Each of those towns is linked to one or more other towns by a road of a certain distance. The Traveling Salesman problem is to find the shortest tour that visits every town.

Sounds simple? Think again.

If you have 3 towns A, B and C with roads between all pairs then you could go:

  • A → B → C
  • A → C → B
  • B → C → A
  • B → A → C
  • C → A → B
  • C → B → A

Well, actually there's less than that, because some of these are equivalent (A → B → C and C → B → A are equivalent, for example, because they use the same roads, just in reverse).

In actuality, there are 3 possibilities.

  • Take this to 4 towns and you have (iirc) 12 possibilities.
  • With 5 it's 60.
  • 6 becomes 360.

This is a function of a mathematical operation called a factorial. Basically:

  • 5! = 5 × 4 × 3 × 2 × 1 = 120
  • 6! = 6 × 5 × 4 × 3 × 2 × 1 = 720
  • 7! = 7 × 6 × 5 × 4 × 3 × 2 × 1 = 5040
  • 25! = 25 × 24 × … × 2 × 1 = 15,511,210,043,330,985,984,000,000
  • 50! = 50 × 49 × … × 2 × 1 = 3.04140932 × 10⁶⁴

So the Big-O of the Traveling Salesman problem is O(n!) or factorial or combinatorial complexity.

By the time you get to 200 towns there isn't enough time left in the universe to solve the problem with traditional computers.
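A brute-force sketch (my own toy example with made-up coordinates) makes the factorial explosion visible: it literally tries every ordering of the towns.

    from itertools import permutations
    from math import dist, factorial

    towns = {"A": (0, 0), "B": (3, 0), "C": (3, 4), "D": (0, 4)}

    def tour_length(order):
        # total distance visiting the towns in this order and returning home
        return sum(dist(towns[a], towns[b]) for a, b in zip(order, order[1:] + order[:1]))

    best = min(permutations(towns), key=tour_length)
    print(best, tour_length(best))        # the best of 4! = 24 possible orderings
    print(factorial(10), factorial(20))   # 3628800 and 2432902008176640000 orderings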

Something to think about.

Polynomial Time

Another point I wanted to make quick mention of is that any algorithm that has a complexity of O(nᵃ) is said to have polynomial complexity, or is solvable in polynomial time.

O(n), O(n²) etc. are all polynomial time. Some problems cannot be solved in polynomial time. Certain things are used in the world because of this. Public-key cryptography is a prime example. It is computationally hard to find two prime factors of a very large number. If it weren't, we couldn't use the public-key systems we use.

Anyway, that's it for my (hopefully plain English) explanation of Big O (revised).

It shows how an algorithm scales.

O(n²): known as quadratic complexity

  • 1 item: 1 second
  • 10 items: 100 seconds
  • 100 items: 10,000 seconds

Notice that the number of items increases by a factor of 10, but the time increases by a factor of 10². Basically, n = 10 and so O(n²) gives us the scaling factor n², which is 10².

O(n): known as linear complexity

  • 1 item: 1 second
  • 10 items: 10 seconds
  • 100 items: 100 seconds

This time the number of items increases by a factor of 10, and so does the time. n = 10 and so O(n)'s scaling factor is 10.

O(1): known as constant complexity

  • 1 item: 1 second
  • 10 items: 1 second
  • 100 items: 1 second

The number of items is still increasing by a factor of 10, but the scaling factor of O(1) is always 1.

O(log n): known as logarithmic complexity

  • 1 item: 1 second
  • 10 items: 2 seconds
  • 100 items: 3 seconds
  • 1,000 items: 4 seconds
  • 10,000 items: 5 seconds

The number of computations is only increased by a log of the input value. So in this case, assuming each computation takes 1 second, the log of the input n is the time required, hence log n.

That's the gist of it. They reduce the maths down so it might not be exactly n² or whatever they say it is, but that'll be the dominating factor in the scaling.
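A short sketch of my own that reproduces the "scaling factor" idea from the tables above, growing the input by 10x under each cost formula:

    import math

    costs = {
        "O(1)":     lambda n: 1,
        "O(log n)": lambda n: math.log10(n),
        "O(n)":     lambda n: n,
        "O(n^2)":   lambda n: n ** 2,
    }

    for name, cost in costs.items():
        factor = cost(1000) / cost(100)     # the input grows by a factor of 10
        print(f"{name:9} grows by a factor of {factor:g}")
    # O(1) -> 1, O(log n) -> 1.5, O(n) -> 10, O(n^2) -> 100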

Big-O notation (also called "asymptotic growth" notation) is what functions "look like" when you ignore constant factors and stuff near the origin. We use it to talk about how things scale.


The basics

for "sufficiently" large inputs ...

  • f(x) ∈ O(upperbound) means f "grows no faster than" upperbound
  • f(x) ∈ Ɵ(justlikethis) means f "grows exactly like" justlikethis
  • f(x) ∈ Ω(lowerbound) means f "grows no slower than" lowerbound

big-O notation doesn't care about constant factors: the function 9x² is said to "grow exactly like" 10x². Neither does big-O asymptotic notation care about non-asymptotic stuff ("stuff near the origin" or "what happens when the problem size is small"): the function 10x² is said to "grow exactly like" 10x² - x + 2.

Why would you want to ignore the smaller parts of the equation? Because they become completely dwarfed by the big parts of the equation as you consider larger and larger scales; their contribution becomes dwarfed and irrelevant. (See the example section.)

Put another way, it's all about the ratio as you go to infinity. If you divide the actual time taken by the O(...), you will get a constant factor in the limit of large inputs. Intuitively this makes sense: functions "scale like" one another if you can multiply one to get the other. That is, when we say ...

    actualAlgorithmTime(N) ∈ O(bound(N))
    e.g. "time to mergesort N elements is O(N log(N))"

... this means that for "big enough" problem sizes N (if we ignore stuff near the origin), there exists some constant (e.g. 2.5, completely made up) such that:

    actualAlgorithmTime(N)                  e.g. "mergesort_duration(N)"
    ──────────────────────  <  constant     ───────────────────────────  <  2.5
           bound(N)                                   N log(N)

There are many choices of constant; often the "best" choice is known as the "constant factor" of the algorithm... but we often ignore it, like we ignore non-largest terms (see the Constant Factors section for why they don't usually matter). You can also think of the above equation as a bound, saying: "In the worst case, the time it takes will never be worse than roughly N*log(N), within a factor of 2.5 (a constant factor we don't care much about)."

In general, O(...) is the most useful one because we often care about worst-case behavior. If f(x) represents something "bad" like processor or memory usage, then "f(x) ∈ O(upperbound)" means "upperbound is the worst-case scenario of processor/memory usage".


Applications

As a purely mathematical construct, big-O notation is not limited to talking about processing time and memory. You can use it to talk about the asymptotics of anything where scaling is meaningful, such as:

  • the number of possible handshakes among N people at a party (Ɵ(N²), specifically N(N-1)/2, but what matters is that it "scales like" N²)
  • the probabilistic expected number of people who have seen some viral marketing as a function of time
  • how website latency scales with the number of processing units in a CPU or GPU or computer cluster
  • how heat output scales on CPU dies as a function of transistor count, voltage, etc.
  • how much time an algorithm needs to run, as a function of input size
  • how much space an algorithm needs to run, as a function of input size

Example

For the handshake example above, everyone in a room shakes everyone else's hand. In that example, #handshakes ∈ Ɵ(N²). Why?

Back up a bit: the number of handshakes is exactly n-choose-2 or N*(N-1)/2 (each of N people shakes the hands of N-1 other people, but this double-counts handshakes, so divide by 2):

(figure: everyone handshakes everyone else, shown as an adjacency matrix; image credit and license per the wikipedia / wikimedia commons article)

However, for very large numbers of people, the linear term N is dwarfed and effectively contributes 0 to the ratio (in the chart: the fraction of empty boxes on the diagonal over total boxes gets smaller as the number of participants grows). Therefore the scaling behavior is order N², or the number of handshakes "grows like N²".

    #handshakes(N)
    ──────────────  ≈  1/2
         N²

It's as if the empty boxes on the diagonal of the chart (N*(N-1)/2 checkmarks) weren't even there (N² checkmarks asymptotically).

(temporary digression from "plain English" 🙂) If you wanted to prove this to yourself, you could perform some simple algebra on the ratio to split it up into multiple terms (lim means "considered in the limit of"; just ignore it if you haven't seen it, it's just notation for "and N is really really big"):

        N²/2 - N/2          (N²)/2    N/2          1/2
    lim ──────────  =  lim ( ──────  - ─── )  =  lim ───  =  1/2
    N→∞     N²         N→∞     N²      N²        N→∞  1
                                      ┕━━━┙
               this is 0 in the limit of N→∞:
               graph it, or plug in a really large number for N

tl;dr: The number of handshakes "looks like" x² so much for large values that, if we wrote down the ratio #handshakes/x², the fact that we don't need exactly x² handshakes wouldn't even show up in the decimal for an arbitrarily long while.

e.g. for x = 1 million, ratio #handshakes/x²: 0.499999...
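A quick numeric check of that ratio, as a sketch of my own:

    def handshakes(n):
        return n * (n - 1) // 2    # n-choose-2

    for n in (10, 1_000, 1_000_000):
        print(n, handshakes(n) / n**2)
    # 10 -> 0.45, 1000 -> 0.4995, 1000000 -> 0.4999995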


Building intuition

This lets us make statements like...

"For big enough inputsize = N, no matter what the constant factor is, if I double the input size...

  • ... I double the time an O(N) ("linear time") algorithm takes."

    cN → c(2N) = 2(cN)

  • ... I double-square (i.e. quadruple) the time an O(N²) ("quadratic time") algorithm takes." (e.g. a problem 100x as big takes 100² = 10000x as long... possibly unsustainable)

    cN² → c(2N)² = 4(cN²)

  • ... I double-cube (i.e. octuple) the time an O(N³) ("cubic time") algorithm takes." (e.g. a problem 100x as big takes 100³ = 1000000x as long... very unsustainable)

    cN³ → c(2N)³ = 8(cN³)

  • ... I add a fixed amount to the time an O(log(N)) ("logarithmic time") algorithm takes." (cheap!)

    c log(N) → c log(2N) = (c log(2)) + (c log(N)) = (fixed amount) + (c log(N))

  • ... I don't change the time an O(1) ("constant time") algorithm takes." (the cheapest!)

    c*1 → c*1

  • ... I "(basically) double" the time an O(N log(N)) algorithm takes." (fairly common)

    it's less than O(N^1.000001), which you might be willing to call basically linear

  • ... I ridiculously increase the time an O(2^N) ("exponential time") algorithm takes." (you'd double (or triple, etc.) the time just by increasing the problem by a single unit)

    2^N → 2^(2N) = (4^N) ............ put another way ...... 2^N → 2^(N+1) = 2^N * 2^1 = 2 * 2^N

[for the mathematically inclined, you can mouse over the spoilers for minor sidenotes]

(with credit to https://stackoverflow.com/a/487292/711085 )

(technically the constant factor could maybe matter in some more esoteric examples, but I've phrased things above (e.g. in log(N)) such that it doesn't)

These are the bread-and-butter orders of growth that programmers and applied computer scientists use as reference points. They see these all the time. (So while you could technically think "Doubling the input makes an O(√N) algorithm 1.414 times slower", it's better to think of it as "this is worse than logarithmic but better than linear".)


Constant factors

Usually we don't care what the specific constant factors are, because they don't affect the way the function grows. For example, two algorithms may both take O(N) time to complete, but one may be twice as slow as the other. We usually don't care too much unless the factor is very big, since optimizing is tricky business (When is optimisation premature?); also, the mere act of picking an algorithm with a better big-O will often improve performance by orders of magnitude.

Some asymptotically superior algorithms (e.g. a non-comparison O(N log(log(N))) sort) can have so large a constant factor (e.g. 100000*N log(log(N))), or overhead that is relatively large like O(N log(log(N))) with a hidden + 100*N, that they are rarely worth using even on "big data".


Why O(N) is sometimes the best you can do, i.e. why we need data structures

O(N) algorithms are in some sense the "best" algorithms if you need to read all your data. The very act of reading a bunch of data is an O(N) operation. Loading it into memory is usually O(N) (or faster if you have hardware support, or no time at all if you've already read the data). However, if you touch or even look at every piece of data (or even every other piece of data), your algorithm will take O(N) time to do that looking. No matter how long your actual algorithm takes, it will be at least O(N) because it spent that time looking at all the data.

The same can be said for the very act of writing. All algorithms which print out N things will take N time, because the output is at least that long (e.g. printing out all permutations (ways to rearrange) of a set of N playing cards is factorial: O(N!)).

This motivates the use of data structures: a data structure requires reading the data only once (usually O(N) time), plus some arbitrary amount of preprocessing (e.g. O(N) or O(N log(N)) or O(N²)) which we try to keep small. Thereafter, modifying the data structure (insertions / deletions / etc.) and making queries on the data take very little time, such as O(1) or O(log(N)). You then proceed to make a large number of queries! In general, the more work you're willing to do ahead of time, the less work you'll have to do later.

For example, say you had the latitude and longitude coordinates of millions of road segments and wanted to find all street intersections.

  • Naive method: if you had the coordinates of a street intersection and wanted to examine nearby streets, you would have to go through the millions of segments each time and check each one for adjacency.
  • If you only needed to do this once, it would not be a problem to do the naive method of O(N) work only once, but if you want to do it many times (in this case, N times, once for each segment), we'd have to do O(N²) work, or 1000000² = 1000000000000 operations. Not good (a modern computer can perform about a billion operations per second).
  • If we use a simple structure called a hash table (an instant-speed lookup table, also known as a hashmap or dictionary), we pay a small cost by preprocessing everything in O(N) time. Thereafter, it only takes constant time on average to look something up by its key (in this case, our key is the latitude and longitude coordinates, rounded into a grid; we search the adjacent grid cells, of which there are only 9, which is a constant).
  • Our task went from an infeasible O(N²) to a manageable O(N), and all we had to do was pay a minor cost to make a hash table.
  • Analogy: the analogy in this particular case is a jigsaw puzzle: we created a data structure that exploits some property of the data. If our road segments are like puzzle pieces, we group them by matching color and pattern. We then exploit this to avoid doing extra work later (comparing puzzle pieces of like color to each other, not to every other single puzzle piece).

The moral of the story: a data structure lets us speed up operations. Even more advanced data structures can let you combine, delay, or even ignore operations in incredibly clever ways. Different problems would have different analogies, but they'd all involve organizing the data in a way that exploits some structure we care about, or which we've artificially imposed on it for bookkeeping. We do work ahead of time (basically planning and organizing), and now repeated tasks are much easier!
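A hedged sketch of that grid-bucketing idea (names and the cell size are made up for illustration): pay O(N) once to bucket points by a rounded coordinate key, then each "what is near this point?" query touches only 9 cells instead of all N points.

    from collections import defaultdict

    def grid_key(lat, lng, cell=0.01):
        return (round(lat / cell), round(lng / cell))

    def build_index(points):                 # O(N) preprocessing
        index = defaultdict(list)
        for p in points:
            index[grid_key(*p)].append(p)
        return index

    def nearby(index, lat, lng):             # ~O(1): 9 neighboring cells, not N points
        kx, ky = grid_key(lat, lng)
        return [p for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                  for p in index[(kx + dx, ky + dy)]]

    points = [(48.8566, 2.3522), (48.8570, 2.3525), (40.7128, -74.0060)]
    index = build_index(points)
    print(nearby(index, 48.8568, 2.3523))    # only the two nearby points come back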


Worked example: visualizing orders of growth while coding

Asymptotic notation is, at its core, quite separate from programming. Asymptotic notation is a mathematical framework for thinking about how things scale, and it can be used in many different fields. That said... this is how you apply asymptotic notation to coding.

The basics: whenever we interact with every element in a collection of size A (such as an array, a set, all the keys of a map, etc.), or perform A iterations of a loop, that is a multiplicative factor of size A. Why do I say "a multiplicative factor"? Because loops and functions (almost by definition) have multiplicative running time: the number of iterations, times the work done in the loop (or, for functions: the number of times you call the function, times the work done in the function). (This holds as long as we don't do anything fancy, like skip loops or exit a loop early, or change control flow in the function based on its arguments, which is very common.) Here are some examples of visualization techniques, with accompanying pseudocode.

(here, the x's represent constant-time units of work, processor instructions, interpreter opcodes, whatever)

    for(i=0; i<A; i++)        // A * ...
        some O(1) operation     // 1

    --> A*1 --> O(A) time

    visualization:

    |<------ A ------->|
    1 2 3 4 5 x x ... x

    other languages, multiplying orders of growth:
      javascript, O(A) time and space
        someListOfSizeA.map((x,i) => [x,i])
      python, O(rows*cols) time and space
        [[r*c for c in range(cols)] for r in range(rows)]

Example 2:

    for every x in listOfSizeA:   // A * ...
        some O(1) operation         // 1
        some O(B) operation         // B
        for every y in listOfSizeC: // C * ...
            some O(1) operation       // 1

    --> O(A*(1 + B + C))
        O(A*(B+C))        (the 1 is dwarfed)

    visualization:

    |<------ A ------->|
    1 x x x x x x ... x

    2 x x x x x x ... x ^
    3 x x x x x x ... x |
    4 x x x x x x ... x |
    5 x x x x x x ... x B  <-- A*B
      x x x x x x ... x |
      ................. |
      x x x x x x ... x v

      x x x x x x ... x ^
      x x x x x x ... x |
      x x x x x x ... x C  <-- A*C
      x x x x x x ... x |
      ................. |
      x x x x x x ... x v

Example 3:

    function nSquaredFunction(n) {
        total = 0
        for i in 1..n:        // N *
            for j in 1..n:      // N *
                total += i*j      // 1
        return total
    }
    // O(n^2)

    function nCubedFunction(a) {
        for i in 1..n:                  // A *
            print(nSquaredFunction(a))    // A^2
    }
    // O(a^3)

If we do something slightly complicated, you might still be able to imagine visually what's going on:

    for x in range(A):
        for y in range(1..x):
            simpleOperation(x*y)

    x x x x x x x x x x |
    x x x x x x x x x   |
    x x x x x x x x     |
    x x x x x x x       |
    x x x x x x         |
    x x x x x           |
    x x x x             |
    x x x               |
    x x                 |
    x___________________|

Here, the smallest recognizable outline you can draw is what matters; a triangle is a two-dimensional shape (0.5 A^2), just like a square is a two-dimensional shape (A^2); the constant factor of two here remains in the asymptotic ratio between the two, but we ignore it like all constant factors... (There are some unfortunate nuances to this technique I don't go into here; it can mislead you.)

Of course this does not mean that loops and functions are bad; on the contrary, they are the building blocks of modern programming languages, and we love them. However, we can see that the way we weave loops and functions and conditionals together with our data (control flow, etc.) mimics the time and space usage of our program! If time and space usage becomes an issue, that is when we have to resort to cleverness and find an easy algorithm or data structure we hadn't considered, to reduce the order of growth somehow. Nevertheless, these visualization techniques (though they don't always work) can give you a naive guess at a worst-case running time.

Here is another thing we can recognize visually:

    <----------------------------- N ----------------------------->
    xxxxxxxxxxxxxxxxxxxxx
    xxxxxxxxxxx
    xxxxxxxxxxxxxxxx
    xxxxxxxx
    xxxx
    xx
    x

We can just rearrange this and see that it's O(N):

    <----------------------------- N ----------------------------->
    xxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxx xxxxxxxxxxxxxxxx|xxxxxxxx|xxxx|xx|x

Or maybe you do log(N) passes over the data, for O(N*log(N)) total time:

    <----------------------------- N ----------------------------->
     ^  xxxxxxxxxxxxxxxx|xxxxxxxxxxxxxxxx
     |  xxxxxxxx|xxxxxxxx|xxxxxxxx|xxxxxxxx
    lgN xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx|xxxx
     |  xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx|xx
     v  x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x|x

Unrelatedly, but worth mentioning again: if we perform a hash (e.g. a dictionary / hashtable lookup), that is a factor of O(1). That's pretty fast.

    [myDictionary.has(x) for x in listOfSizeA]
     \----- O(1) ------/

    --> A*1 --> O(A)

If we do something very complicated, such as with a recursive function or a divide-and-conquer algorithm, you can use the Master Theorem (usually works) or, in ridiculous cases, the Akra-Bazzi theorem (almost always works); you look up the running time of your algorithm on Wikipedia.

But programmers don't think like this, because eventually algorithm intuition just becomes second nature. You will start to code something inefficient and immediately think "am I doing something grossly inefficient?". If the answer is "yes" AND you foresee it actually mattering, then you can take a step back and think of various tricks to make things run faster (the answer is almost always "use a hashtable", rarely "use a tree", and very rarely something a bit more complicated).


Amortized and average-case complexity

There is also the concept of "amortized" and/or "average case" complexity (note that these are different).

Average case: this is no more than using big-O notation for the expected value of a function, rather than the function itself. In the usual case where you consider all inputs equally likely, the average case is just the average of the running time. For example with quicksort, even though the worst case is O(N^2) for some really-bad inputs, the average case is the usual O(N log(N)) (the really-bad inputs are very small in number, so few that we don't notice them in the average case).

Amortized worst-case: some data structures may have a worst-case complexity that is large, but guarantee that if you do many of these operations, the average amount of work you do will be better than the worst case. For example, you may have a data structure that normally takes constant O(1) time. However, occasionally it will 'hiccup' and take O(N) time for one random operation, because maybe it needs to do some bookkeeping or garbage collection or something... but it promises you that if it does hiccup, it won't hiccup again for N more operations. The worst-case cost is still O(N) per operation, but the amortized cost over many runs is O(N)/N = O(1) per operation. Because the big operations are sufficiently rare, the massive amount of occasional work can be considered to blend in with the rest of the work as a constant factor. We say the work is "amortized" over a sufficiently large number of calls that it disappears asymptotically.

The analogy for amortized analysis:

You drive a car. Occasionally, you need to spend 10 minutes going to the gas station and then spend 1 minute refilling the tank with gas. If you did this every time you went anywhere with your car (spend 10 minutes driving to the gas station, spend a few seconds filling up a fraction of a gallon), it would be very inefficient. But if you fill up the tank once every few days, the 11 minutes spent driving to the gas station is "amortized" over a sufficiently large number of trips, that you can ignore it and pretend all your trips were maybe 5% longer.
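As a toy sketch of my own of the amortized-O(1) idea (not from the answer): a growable array whose append is usually O(1) but occasionally "hiccups" by copying everything into a bigger buffer.

    class GrowableArray:
        def __init__(self):
            self.capacity, self.size = 1, 0
            self.buffer = [None]
            self.copies = 0                     # elements moved during resizes

        def append(self, value):
            if self.size == self.capacity:      # the rare O(N) hiccup
                self.capacity *= 2
                new_buffer = [None] * self.capacity
                for i in range(self.size):      # copy every existing element
                    new_buffer[i] = self.buffer[i]
                    self.copies += 1
                self.buffer = new_buffer
            self.buffer[self.size] = value      # the usual O(1) step
            self.size += 1

    arr = GrowableArray()
    for i in range(1_000_000):
        arr.append(i)
    print(arr.copies / arr.size)   # about 1.05 copies per append: amortized O(1)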

Comparison between average-case and amortized worst-case:

  • Average-case: We make some assumptions about our inputs; ie if our inputs have different probabilities, then our outputs/runtimes will have different probabilities (which we take the average of). Usually we assume that our inputs are all equally likely (uniform probability), but if the real-world inputs don't fit our assumptions of "average input", the average output/runtime calculations may be meaningless. If you anticipate uniformly random inputs though, this is useful to think about!
  • Amortized worst-case: If you use an amortized worst-case data structure, the performance is guaranteed to be within the amortized worst-case... eventually (even if the inputs are chosen by an evil demon who knows everything and is trying to screw you over). Usually we use this to analyze algorithms which may be very 'choppy' in performance with unexpected large hiccups, but over time perform just as well as other algorithms. (However, unless your data structure has upper limits on how much outstanding work it is willing to procrastinate on, an evil attacker could perhaps force you to catch up on the maximum amount of procrastinated work all-at-once.

Though, if you're reasonably worried about an attacker, there are many other algorithmic attack vectors to worry about besides amortization and average-case.)

Both average-case and amortization are incredibly useful tools for thinking about and designing with scaling in mind.

(See Difference between average case and amortized analysis if interested in this subtopic.)


Multidimensional big-O

Most of the time, people don't realize that there's more than one variable at work. For example, in a string-search algorithm, your algorithm may take time O([length of text] + [length of query]), i.e. it is linear in two variables like O(N+M). Other more naive algorithms may be O([length of text]*[length of query]) or O(N*M). Ignoring multiple variables is one of the most common oversights I see in algorithm analysis, and it can handicap you when designing an algorithm.
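A small sketch of my own of the two-variable point: a naive substring search does work proportional to both lengths, i.e. O(N*M).

    def naive_find(text, query):
        n, m = len(text), len(query)
        for i in range(n - m + 1):          # up to N starting positions...
            if text[i:i + m] == query:      # ...each checked against M characters
                return i
        return -1

    print(naive_find("abracadabra", "cad"))   # 4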


The whole story

Keep in mind that big-O is not the whole story. You can drastically speed up some algorithms by using caching, making them cache-oblivious, avoiding bottlenecks by working with RAM instead of disk, using parallelization, or doing work ahead of time -- these techniques are often independent of the order-of-growth "big-O" notation, though you will often see the number of cores in the big-O notation of parallel algorithms.

Also keep in mind that due to hidden constraints of your program, you might not really care about asymptotic behavior. You may be working with a bounded number of values, for example:

  • If you're sorting something like 5 elements, you don't want to use the speedy O(N log(N)) quicksort; you want to use insertion sort, which happens to perform well on small inputs. These situations often come up in divide-and-conquer algorithms, where you split up the problem into smaller and smaller subproblems, such as recursive sorting, fast Fourier transforms, or matrix multiplication.
  • If some values are effectively bounded due to some hidden fact (eg the average human name is softly bounded at perhaps 40 letters, and human age is softly bounded at around 150). You can also impose bounds on your input to effectively make terms constant.

In practice, even among algorithms which have the same or similar asymptotic performance, their relative merit may actually be driven by other things, such as: other performance factors (quicksort and mergesort are both O(N log(N)) , but quicksort takes advantage of CPU caches); non-performance considerations, like ease of implementation; whether a library is available, and how reputable and maintained the library is.

Programs will also run slower on a 500MHz computer vs a 2GHz computer. We don't really consider this as part of the resource bounds, because we think of the scaling in terms of machine resources (eg per clock cycle), not per real second. However, there are similar things which can 'secretly' affect performance, such as whether you are running under emulation, or whether the compiler optimized code or not. These might make some basic operations take longer (even relative to each other), or even speed up or slow down some operations asymptotically (even relative to each other). The effect may be small or large between different implementations and/or environments. Do you switch languages or machines to eke out that little extra work? That depends on a hundred other reasons (necessity, skills, coworkers, programmer productivity, the monetary value of your time, familiarity, workarounds, why not assembly or GPU, etc...), which may be more important than performance.

The above issues, like programming language, are almost never considered as part of the constant factor (nor should they be); yet one should be aware of them, because sometimes (though rarely) they may affect things. For example in cpython, the native priority queue implementation is asymptotically non-optimal ( O(log(N)) rather than O(1) for your choice of insertion or find-min); do you use another implementation? Probably not, since the C implementation is probably faster, and there are probably other similar issues elsewhere. There are tradeoffs; sometimes they matter and sometimes they don't.


( edit : The "plain English" explanation ends here.)

Math addenda

For completeness, the precise definition of big-O notation is as follows: f(x) ∈ O(g(x)) means that "f is asymptotically upper-bounded by const*g": ignoring everything below some finite value of x, there exists a constant such that |f(x)| ≤ const * |g(x)| . (The other symbols are as follows: just like O means ≤, Ω means ≥. There are lowercase variants: o means <, and ω means >.) f(x) ∈ Ɵ(g(x)) means both f(x) ∈ O(g(x)) and f(x) ∈ Ω(g(x)) (upper- and lower-bounded by g): there exists some constants such that f will always lie in the "band" between const1*g(x) and const2*g(x) . It is the strongest asymptotic statement you can make and roughly equivalent to == . (Sorry, I elected to delay the mention of the absolute-value symbols until now, for clarity's sake; especially because I have never seen negative values come up in a computer science context.)

People will often use = O(...) , which is perhaps the more correct 'comp-sci' notation, and entirely legitimate to use... but one should realize = is not being used as equality; it is a compound notation that must be read as an idiom. I was taught to use the more rigorous ∈ O(...) . means "is an element of". O(N²) is actually an equivalence class , that is, it is a set of things which we consider to be the same. In this particular case, O(N²) contains elements like { 2 N² , 3 N² , 1/2 N² , 2 N² + log(N) , - N² + N^1.9 , ...} and is infinitely large, but it's still a set. The = notation might be the more common one, and is even used in papers by world-renowned computer scientists. Additionally, it is often the case that in a casual setting, people will say O(...) when they mean Ɵ(...) ; this is technically true since the set of things Ɵ(exactlyThis) is a subset of O(noGreaterThanThis) ... and it's easier to type. 😉

EDIT: Quick note, this is almost certainly confusing Big O notation (which is an upper bound) with Theta notation (which is both an upper and lower bound). In my experience this is actually typical of discussions in non-academic settings. Apologies for any confusion caused.

In one sentence: As the size of your job goes up, how much longer does it take to complete it?

Obviously that’s only using “size” as the input and “time taken” as the output — the same idea applies if you want to talk about memory usage etc.

Here’s an example where we have N T-shirts which we want to dry. We’ll assume it’s incredibly quick to get them in the drying position (ie the human interaction is negligible). That’s not the case in real life, of course…

  • Using a washing line outside: assuming you have an infinitely large back yard, washing dries in O(1) time. However much you have of it, it’ll get the same sun and fresh air, so the size doesn’t affect the drying time.

  • Using a tumble dryer: you put 10 shirts in each load, and then they’re done an hour later. (Ignore the actual numbers here — they’re irrelevant.) So drying 50 shirts takes about 5 times as long as drying 10 shirts.

  • Putting everything in an airing cupboard: If we put everything in one big stack and just let general warmth do it, it will take a long time for the middle shirts to get dry. I wouldn’t like to guess at the detail, but I suspect this is at least O(N^2) — as you increase the wash load, the drying time increases faster.

One important aspect of "big O" notation is that it doesn't say which algorithm will be faster for a given size. Take a hashtable (string key, integer value) vs an array of pairs (string, integer). Is it faster to find a key in the hashtable or an element in the array, based on a string? (i.e. for the array, "find the first element where the string part matches the given key.") Hashtables are generally amortised (~= "on average") O(1) — once they're set up, it should take about the same time to find an entry in a 100 entry table as in a 1,000,000 entry table. Finding an element in an array (based on content rather than index) is linear, i.e. O(N) — on average, you're going to have to look at half the entries.

Does this make a hashtable faster than an array for lookups? Not necessarily. If you've got a very small collection of entries, an array may well be faster — you may be able to check all the strings in the time that it takes to just calculate the hashcode of the one you're looking at. As the data set grows larger, however, the hashtable will eventually beat the array.
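A rough timing sketch (mine, with made-up keys) of that trade-off; the exact numbers depend on the machine, only the trend matters:

    import timeit

    n = 100_000
    pairs = [(f"key{i}", i) for i in range(n)]
    table = dict(pairs)

    def array_lookup(key):          # O(N): scan the pairs until the key matches
        for k, v in pairs:
            if k == key:
                return v

    def hash_lookup(key):           # amortized O(1)
        return table[key]

    for fn in (array_lookup, hash_lookup):
        t = timeit.timeit(lambda: fn("key99999"), number=100)
        print(fn.__name__, f"{t:.5f}s")   # the linear scan is far slower at this size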

Big O describes an upper limit on the growth behaviour of a function, for example the runtime of a program, when inputs become large.

Examples:

  • O(n): If I double the input size the runtime doubles

  • O(n²): If the input size doubles the runtime quadruples

  • O(log n): If the input size doubles the runtime increases by one

  • O(2ⁿ): If the input size increases by one, the runtime doubles

The input size is usually the space in bits needed to represent the input.

Big O notation is most commonly used by programmers as an approximate measure of how long a computation (algorithm) will take to complete expressed as a function of the size of the input set.

Big O is useful to compare how well two algorithms will scale up as the number of inputs is increased.

More precisely Big O notation is used to express the asymptotic behavior of a function. That means how the function behaves as it approaches infinity.

In many cases the “O” of an algorithm will fall into one of the following cases:

  • O(1) – Time to complete is the same regardless of the size of input set. An example is accessing an array element by index.
  • O(Log N) – Time to complete increases roughly in line with the log2(n). For example 1024 items takes roughly twice as long as 32 items, because Log2(1024) = 10 and Log2(32) = 5. An example is finding an item in a binary search tree (BST).
  • O(N) – Time to complete that scales linearly with the size of the input set. In other words if you double the number of items in the input set, the algorithm takes roughly twice as long. An example is counting the number of items in a linked list.
  • O(N Log N) – Time to complete increases by the number of items times the result of Log2(N). An example of this is heap sort and quick sort .
  • O(N^2) – Time to complete is roughly equal to the square of the number of items. An example of this is bubble sort .
  • O(N!) – Time to complete is the factorial of the input set. An example of this is the traveling salesman problem brute-force solution .

Big O ignores factors that do not contribute in a meaningful way to the growth curve of a function as the input size increases towards infinity. This means that constants that are added to or multiplied by the function are simply ignored.

Big O is just a way to “Express” yourself in a common way, “How much time / space does it take to run my code?”.

You may often see O(n), O(n²), O(n log n) and so forth; all these are just ways to show: how does an algorithm change?

O(n) means Big O is n, and now you might think, "What is n!?" Well, "n" is the number of elements. Imagine you want to search for an item in an array. You would have to look at each element and ask, "Are you the correct element/item?" In the worst case, the item is at the last index, which means that it took as much time as there are items in the list, so to be generic, we say "oh hey, n is a fair given amount of values!".

So then you might understand what "n²" means, but to be even more specific, play with the thought that you have the simplest of the sorting algorithms: bubble sort. This algorithm needs to look through the whole list, for each item.

My list

  1. 1
  2. 6
  3. 3

The flow here would be:

  • Compare 1 and 6, which is biggest? Ok 6 is in the right position, moving forward!
  • Compare 6 and 3, oh, 3 is less! Let's move that. Ok, the list changed, we need to start from the beginning now!

This is O(n²) because you need to look at all the items in the list, and there are "n" items. For each item, you look at all items once more to compare; this is also "n", so for every item you look "n" times, meaning n*n = n².
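A minimal bubble sort sketch of my own, counting comparisons to make the n*n behaviour visible:

    def bubble_sort(items):
        items = list(items)
        comparisons = 0
        for _ in range(len(items)):            # n passes...
            for i in range(len(items) - 1):    # ...of n-1 comparisons each
                comparisons += 1
                if items[i] > items[i + 1]:
                    items[i], items[i + 1] = items[i + 1], items[i]
        return items, comparisons

    print(bubble_sort([1, 6, 3]))              # ([1, 3, 6], 6)
    print(bubble_sort(range(100, 0, -1))[1])   # 9900, roughly 100 * 100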

I hope this is as simple as you want it.

But remember, Big O is just a way to express yourself in the manner of time and space.

Big O describes the fundamental scaling nature of an algorithm.

There is a lot of information that Big O does not tell you about a given algorithm. It cuts to the bone and gives only information about the scaling nature of an algorithm, specifically how the resource use (think time or memory) of an algorithm scales in response to the “input size”.

Consider the difference between a steam engine and a rocket. They are not merely different varieties of the same thing (as, say, a Prius engine vs. a Lamborghini engine) but they are dramatically different kinds of propulsion systems, at their core. A steam engine may be faster than a toy rocket, but no steam piston engine will be able to achieve the speeds of an orbital launch vehicle. This is because these systems have different scaling characteristics with regards to the relation of fuel required (“resource usage”) to reach a given speed (“input size”).

Why is this so important? Because software deals with problems that may differ in size by factors up to a trillion. Consider that for a moment. The ratio between the speed necessary to travel to the Moon and human walking speed is less than 10,000:1, and that is absolutely tiny compared to the range in input sizes software may face. And because software may face an astronomical range in input sizes there is the potential for the Big O complexity of an algorithm, its fundamental scaling nature, to trump any implementation details.

Consider the canonical sorting example. Bubble-sort is O(n²) while merge-sort is O(n log n). Let's say you have two sorting applications, application A which uses bubble-sort and application B which uses merge-sort, and let's say that for input sizes of around 30 elements application A is 1,000x faster than application B at sorting. If you never have to sort much more than 30 elements then it's obvious that you should prefer application A, as it is much faster at these input sizes. However, if you find that you may have to sort ten million items then what you'd expect is that application B actually ends up being thousands of times faster than application A in this case, entirely due to the way each algorithm scales.

Here is the plain English bestiary I tend to use when explaining the common varieties of Big-O

In all cases, prefer algorithms higher up on the list to those lower on the list. However, the cost of moving to a more expensive complexity class varies significantly.

O(1):

No growth. Regardless of how big the problem is, you can solve it in the same amount of time. This is somewhat analogous to broadcasting, where it takes the same amount of energy to broadcast over a given distance regardless of the number of people that lie within the broadcast range.

O(log n ):

This complexity is the same as O(1) except that it's just a little bit worse. For all practical purposes, you can consider this as a very large constant scaling. The difference in work between processing 1 thousand and 1 billion items is only a factor of three.

O( n ):

The cost of solving the problem is proportional to the size of the problem. If your problem doubles in size, then the cost of the solution doubles. Since most problems have to be scanned into the computer in some way, as data entry, disk reads, or network traffic, this is generally an affordable scaling factor.

O( n log n ):

This complexity is very similar to O( n ) . For all practical purposes, the two are equivalent. This level of complexity would generally still be considered scalable. By tweaking assumptions some O( n log n ) algorithms can be transformed into O( n ) algorithms. For example, bounding the size of keys reduces sorting from O( n log n ) to O( n ) .

O(n²):

Grows as a square, where n is the length of the side of a square. This is the same growth rate as the "network effect", where everyone in a network might know everyone else in the network. Growth is expensive. Most scalable solutions cannot use algorithms with this level of complexity without doing significant gymnastics. This generally applies to all other polynomial complexities – O(nᵏ) – as well.

O(2ⁿ):

Does not scale. You have no hope of solving any non-trivially sized problem. Useful for knowing what to avoid, and for experts to find approximate algorithms which are in O(nᵏ).

Big O is a measure of how much time/space an algorithm uses relative to the size of its input.

If an algorithm is O(n) then the time/space will increase at the same rate as its input.

If an algorithm is O(n²) then the time/space will increase at the rate of its input squared.

and so on.

It is very difficult to measure the speed of software programs, and when we try, the answers can be very complex and filled with exceptions and special cases. This is a big problem, because all those exceptions and special cases are distracting and unhelpful when we want to compare two different programs with one another to find out which is “fastest”.

As a result of all this unhelpful complexity, people try to describe the speed of software programs using the smallest and least complex (mathematical) expressions possible. These expressions are very very crude approximations: Although, with a bit of luck, they will capture the “essence” of whether a piece of software is fast or slow.

Because they are approximations, we use the letter “O” (Big Oh) in the expression, as a convention to signal to the reader that we are making a gross oversimplification. (And to make sure that nobody mistakenly thinks that the expression is in any way accurate).

If you read the “Oh” as meaning “on the order of” or “approximately” you will not go too far wrong. (I think the choice of the Big-Oh might have been an attempt at humour).

The only thing that these "Big-Oh" expressions try to do is to describe how much the software slows down as we increase the amount of data that the software has to process. If we double the amount of data that needs to be processed, does the software need twice as long to finish its work? Ten times as long? In practice, there are a very limited number of big-Oh expressions that you will encounter and need to worry about:

The good:

  • O(1) Constant : The program takes the same time to run no matter how big the input is.
  • O(log n) Logarithmic : The program run-time increases only slowly, even with big increases in the size of the input.

The bad:

  • O(n) Linear : The program run-time increases proportionally to the size of the input.
  • O(n^k) Polynomial : Processing time grows faster and faster – as a polynomial function – as the size of the input increases.

… and the ugly:

  • O(k^n) Exponential The program run-time increases very quickly with even moderate increases in the size of the problem – it is only practical to process small data sets with exponential algorithms.
  • O(n!) Factorial The program run-time will be longer than you can afford to wait for anything but the very smallest and most trivial-seeming datasets.

What is a plain English explanation of Big O? With as little formal definition as possible and simple mathematics.

A Plain English Explanation of the Need for Big-O Notation:

When we program, we are trying to solve a problem. What we code is called an algorithm. Big O notation allows us to compare the worst-case performance of our algorithms in a standardized way. Hardware specs vary over time, and improvements in hardware can reduce the time it takes an algorithm to run. But replacing the hardware does not mean our algorithm is any better or improved over time, as our algorithm is still the same. So in order to allow us to compare different algorithms, to determine if one is better or not, we use Big O notation.

A Plain English Explanation of What Big O Notation is:

Not all algorithms run in the same amount of time, and the time can vary based on the number of items in the input, which we'll call n. Based on this, we consider the worst-case analysis, or an upper bound on the run-time as n gets larger and larger. We must be aware of what n is, because many of the Big O notations reference it.

A simple straightforward answer can be:

Big O represents the worst possible time/space for that algorithm. The algorithm will never take more space/time above that limit. Big O represents time/space complexity in the extreme case.

Ok, my 2cents.

Big-O is the rate of increase of a resource consumed by a program, w.r.t. problem-instance-size.

Resource : Could be total-CPU time, could be maximum RAM space. By default refers to CPU time.

Say the problem is “Find the sum”,

    int Sum(int *arr, int size) {
        int sum = 0;
        while (size-- > 0)
            sum += arr[size];
        return sum;
    }

problem-instance= {5,10,15} ==> problem-instance-size = 3, iterations-in-loop= 3

problem-instance= {5,10,15,20,25} ==> problem-instance-size = 5 iterations-in-loop = 5

For an input of size "n" the program grows at a speed of "n" iterations over the array. Hence Big-O is N, expressed as O(n).

Say the problem is “Find the Combination”,

    void Combination(int *arr, int size) {
        int outer = size, inner = size;
        while (outer-- > 0) {
            inner = size;
            while (inner-- > 0)
                cout << arr[outer] << "-" << arr[inner] << endl;   // print the pair
        }
    }

problem-instance= {5,10,15} ==> problem-instance-size = 3, total-iterations = 3*3 = 9

problem-instance= {5,10,15,20,25} ==> problem-instance-size = 5, total-iterations= 5*5 =25

For an input of size "n" the program grows at a speed of "n*n" iterations over the array. Hence Big-O is N², expressed as O(n²).

Big O notation is a way of describing the upper bound of an algorithm in terms of space or running time. The n is the number of elements in the problem (i.e. size of an array, number of nodes in a tree, etc.) We are interested in describing the running time as n gets big.

When we say some algorithm is O(f(n)) we are saying that the running time (or space required) by that algorithm is always lower than some constant times f(n).

To say that binary search has a running time of O(logn) is to say that there exists some constant c which you can multiply log(n) by that will always be larger than the running time of binary search. In this case you will always have some constant factor of log(n) comparisons.

In other words, where g(n) is the running time of your algorithm, we say that g(n) = O(f(n)) when g(n) <= c*f(n) for all n > k, where c and k are some constants.
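For instance (with numbers chosen purely for illustration): if an algorithm’s running time is g(n) = 3·log(n) + 5, then g(n) = O(log n), because taking c = 4 and k = 32 gives 3·log(n) + 5 <= 4·log(n) for all n > 32 (since, in base 2, log(n) > 5 once n > 32).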

What is a plain English explanation of Big O? With as little formal definition as possible and simple mathematics.

Such a beautifully simple and short question seems at least to deserve an equally short answer, like a student might receive during tutoring.

Big O notation simply tells you how much time* an algorithm can take to run, in terms of only the amount of input data**.

( *in a wonderful, unit-free sense of time!)
(**which is what matters, because people will always want more , whether they live today or tomorrow)

Well, what’s so wonderful about Big O notation if that’s what it does?

  • Practically speaking, Big O analysis is so useful and important because Big O puts the focus squarely on the algorithm’s own complexity and completely ignores anything that is merely a proportionality constant, like a JavaScript engine, the speed of a CPU, your Internet connection, and all those things which quickly become as laughably outdated as a Model T. Big O focuses on performance only in the way that matters equally to people living in the present or in the future.

  • Big O notation also shines a spotlight directly on the most important principle of computer programming/engineering, the fact which inspires all good programmers to keep thinking and dreaming: the only way to achieve results beyond the slow forward march of technology is to invent a better algorithm .

Algorithm example (Java):

 // given a list of integers L, and an integer K
 public boolean simple_search(List<Integer> L, Integer K) {
     // for each integer i in list L
     for (Integer i : L) {
         // if i is equal to K (use equals, not ==, to compare Integer values)
         if (i.equals(K)) {
             return true;
         }
     }
     return false;
 }

Algorithm description:

  • This algorithm searches a list, item by item, looking for a key,

  • Iterating on each item in the list, if it’s the key then return True,

  • If the loop has finished without finding the key, return False.

Big-O notation represents the upper bound on the complexity (time, space, ...).

To find the Big-O of the time complexity:

  • Calculate how much time (regarding input size) the worst case takes:

  • Worst-Case: the key doesn’t exist in the list.

  • Time(Worst-Case) = 4n+1 (roughly a constant number of operations per loop iteration, plus one final return – see the annotated sketch just after this list)

  • Time: O(4n+1) = O(n) | in Big-O, constants are neglected

  • O(n) ~ Linear
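One rough way to arrive at a count like 4n+1 (the accounting below is my own; the exact constants do not matter once we take the Big-O):

 // Worst case: the key K is not in the list of n items.
 public boolean simple_search(List<Integer> L, Integer K) {
     for (Integer i : L) {        // per item: fetch next element + loop test  (~2 ops)
         if (i.equals(K)) {       // per item: one comparison                  (~1 op)
             return true;         // never reached in the worst case
         }
     }                            // per item: assorted bookkeeping            (~1 op)
     return false;                // one final return                          (+1)
 }                                // about 4 ops per item over n items: 4n + 1, i.e. O(n)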

There’s also Big-Omega, which represents the complexity of the best case:

  • Best-Case: the key is the first item.

  • Time(Best-Case) = 4

  • Time: Ω(4) = Ω(1) ~ instant/constant

Big O

f(x) = O(g(x)) when x goes to a (for example, a = +∞) means that there is a function k such that:

  1. f(x) = k(x)·g(x)

  2. k is bounded in some neighborhood of a (if a = +∞, this means that there are numbers N and M such that for every x > N, |k(x)| < M).

In other words, in plain English: f(x) = O(g(x)), x → a, means that in a neighborhood of a, f decomposes into the product of g and some bounded function.
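For example, sin x = O(x) as x → 0, because sin x = k(x)·x with k(x) = sin(x)/x, and that k stays bounded near 0 (it tends to 1).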

Small o

By the way, here is for comparison the definition of small o.

f(x) = o(g(x)) when x goes to a means that there is a function k such that:

  1. f(x) = k(x)·g(x)

  2. k(x) goes to 0 when x goes to a.

Examples

  • sin x = O(x) when x → 0.

  • sin x = O(1) when x → +∞,

  • x^2 + x = O(x) when x → 0,

  • x^2 + x = O(x^2) when x → +∞,

  • ln(x) = o(x) = O(x) when x → +∞.

Attention! The notation with the equal sign “=” uses a “fake equality”: it is true that o(g(x)) = O(g(x)), but false that O(g(x)) = o(g(x)). Similarly, it is ok to write “ln(x) = o(x) when x → +∞”, but the formula “o(x) = ln(x)” would make no sense.

More examples

  • O(1) = O(n) = O(n^2) when n → +∞ (but not the other way around, the equality is “fake”),

  • O(n) + O(n^2) = O(n^2) when n → +∞

  • O(O(n^2)) = O(n^2) when n → +∞

  • O(n^2)·O(n^3) = O(n^5) when n → +∞
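The second line, for instance, holds because n ≤ n^2 once n ≥ 1, so any sum c1·n + c2·n^2 is at most (c1 + c2)·n^2; the bounding constant changes, but the class O(n^2) does not.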


Here is the Wikipedia article: https://en.wikipedia.org/wiki/Big_O_notation

Big O notation is a way of describing how quickly an algorithm will run given an arbitrary number of input parameters, which we’ll call “n”. It is useful in computer science because different machines operate at different speeds, and simply saying that an algorithm takes 5 seconds doesn’t tell you much: while you may be running a system with a 4.5 GHz octa-core processor, I may be running a 15-year-old, 800 MHz system, which could take longer regardless of the algorithm. So instead of specifying how fast an algorithm runs in terms of time, we say how fast it runs in terms of the number of input parameters, or “n”. By describing algorithms in this way, we are able to compare the speeds of algorithms without having to take into account the speed of the computer itself.

Not sure I’m further contributing to the subject but still thought I’d share: I once found this blog post to have some quite helpful (though very basic) explanations & examples on Big O:

Via examples, this helped get the bare basics into my tortoiseshell-like skull, so I think it’s a pretty decent 10-minute read to get you headed in the right direction.

You want to know all there is to know of big O? So do I.

So to talk of big O, I will use words that have just one beat in them. One sound per word. Small words are quick. You know these words, and so do I. We will use words with one sound. They are small. I am sure you will know all of the words we will use!

Now, let’s you and me talk of work. Most of the time, I do not like work. Do you like work? It may be the case that you do, but I am sure I do not.

I do not like to go to work. I do not like to spend time at work. If I had my way, I would like just to play, and do fun things. Do you feel the same as I do?

Now at times, I do have to go to work. It is sad, but true. So, when I am at work, I have a rule: I try to do less work. As near to no work as I can. Then I go play!

So here is the big news: the big O can help me not to do work! I can play more of the time, if I know big O. Less work, more play! That is what big O helps me do.

Now I have some work. I have this list: one, two, three, four, five, six. I must add all things in this list.

Wow, I hate work. But oh well, I have to do this. So here I go.

One plus two is three… plus three is six… and four is… I don’t know. I got lost. It is too hard for me to do in my head. I don’t much care for this kind of work.

So let’s not do the work. Let’s you and me just think how hard it is. How much work would I have to do, to add six numbers?

Well, let’s see. I must add one and two, and then add that to three, and then add that to four… All in all, I count six adds. I have to do six adds to solve this.

Here comes big O, to tell us just how hard this math is.

Big O says: we must do six adds to solve this. One add, for each thing from one to six. Six small bits of work… each bit of work is one add.

Well, I will not do the work to add them now. But I know how hard it would be. It would be six adds.

Oh no, now I have more work. Sheesh. Who makes this kind of stuff?!

Now they ask me to add from one to ten! Why would I do that? I did not want to add one to six. To add from one to ten… well… that would be even more hard!

How much more hard would it be? How much more work would I have to do? Do I need more or less steps?

Well, I guess I would have to do ten adds… one for each thing from one to ten. Ten is more than six. I would have to work that much more to add from one to ten, than one to six!

I do not want to add right now. I just want to think on how hard it might be to add that much. And, I hope, to play as soon as I can.

To add from one to six, that is some work. But do you see, to add from one to ten, that is more work?

Big O is your friend and mine. Big O helps us think on how much work we have to do, so we can plan. And, if we are friends with big O, he can help us choose work that is not so hard!

Now we must do new work. Oh no. I don’t like this work thing at all.

The new work is: add all things from one to n.

Wait! What is n? Did I miss that? How can I add from one to n if you don’t tell me what n is?

Well, I don’t know what n is. I was not told. Were you? No? Too bad. So we can’t do the work. Phew.

But though we will not do the work now, we can guess how hard it would be, if we knew n. We would have to add up n things, right? Of course!

Now here comes big O, and he will tell us how hard this work is. He says: to add all things from one to N, one by one, is O(n). To add all these things, I know I must add n times.[1] That is big O! He tells us how hard it is to do some type of work.

To me, I think of big O like a big, slow, boss man. He thinks on work, but he does not do it. He might say, “That work is quick.” Or, he might say, “That work is so slow and hard!” But he does not do the work. He just looks at the work, and then he tells us how much time it might take.

I care lots for big O. Why? I do not like to work! No one likes to work. That is why we all love big O! He tells us how fast we can work. He helps us think of how hard work is.

Uh oh, more work. Now, let’s not do the work. But, let’s make a plan to do it, step by step.

They gave us a deck of ten cards. They are all mixed up: seven, four, two, six… not straight at all. And now… our job is to sort them.

Ergh. That sounds like a lot of work!

How can we sort this deck? I have a plan.

I will look at each pair of cards, pair by pair, through the deck, from first to last. If the first card in one pair is big and the next card in that pair is small, I swap them. Else, I go to the next pair, and so on and so on… and soon, the deck is done.

When the deck is done, I ask: did I swap cards in that pass? If so, I must do it all once more, from the top.

At some point, at some time, there will be no swaps, and our sort of the deck would be done. So much work!

Well, how much work would that be, to sort the cards with those rules?

I have ten cards. And, most of the time — that is, if I don’t have lots of luck — I must go through the whole deck up to ten times, with up to ten card swaps each time through the deck.

Big O, help me!

Big O comes in and says: for a deck of n cards, to sort it this way will be done in O(N squared) time.

Why does he say n squared?

Well, you know n squared is n times n. Now, I get it: n cards checked, up to what might be n times through the deck. That is two loops, each with n steps. That is n squared much work to be done. A lot of work, for sure!

Now when big O says it will take O(n squared) work, he does not mean n squared adds, on the nose. It might be some small bit less, for some case. But in the worst case, it will be near n squared steps of work to sort the deck.
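If you would like to see this pair-swap sort as code, here is a small sketch in Java (the names are mine, not from the text above):

 // Pass through the deck, swap bad pairs, and repeat until a pass
 // makes no swaps. Two nested loops over n cards give the O(n squared) cost.
 static void pairSwapSort(int[] deck) {
     boolean swapped = true;
     while (swapped) {                            // one pass through the deck
         swapped = false;
         for (int i = 0; i + 1 < deck.length; i++) {
             if (deck[i] > deck[i + 1]) {         // big card right before a small card?
                 int tmp = deck[i];               // swap the pair
                 deck[i] = deck[i + 1];
                 deck[i + 1] = tmp;
                 swapped = true;
             }
         }
     }                                            // stop when a full pass needs no swaps
 }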

Now here is where big O is our friend.

Big O points out this: as n gets big, when we sort cards, the job gets MUCH MUCH MORE HARD than the old just-add-these-things job. How do we know this?

Well, if n gets real big, we do not care what we might add to n or n squared.

For big n, n squared is more large than n.

Big O tells us that to sort things is more hard than to add things. O(n squared) is more than O(n) for big n. That means: if n gets real big, to sort a mixed deck of n things MUST take more time, than to just add n mixed things.

Big O does not solve the work for us. Big O tells us how hard the work is.

I have a deck of cards. I did sort them. You helped. Thanks.

Is there a more fast way to sort the cards? Can big O help us?

Yes, there is a more fast way! It takes some time to learn, but it works… and it works quite fast. You can try it too, but take your time with each step and do not lose your place.

In this new way to sort a deck, we do not check pairs of cards the way we did a while ago. Here are your new rules to sort this deck:

One: I choose one card in the part of the deck we work on now. You can choose one for me if you like. (The first time we do this, “the part of the deck we work on now” is the whole deck, of course.)

Two: I splay the deck on that card you chose. What is this splay; how do I splay? Well, I go from the start card down, one by one, and I look for a card that is more high than the splay card.

Three: I go from the end card up, and I look for a card that is more low than the splay card.

Once I have found these two cards, I swap them, and go on to look for more cards to swap. That is, I go back to step Two, and splay on the card you chose some more.

At some point, this loop (from Two to Three) will end. It ends when both halves of this search meet at the splay card. Then, we have just splayed the deck with the card you chose in step One. Now, all the cards near the start are more low than the splay card; and the cards near the end are more high than the splay card. Cool trick!

Four (and this is the fun part): I have two small decks now, one more low than the splay card, and one more high. Now I go to step one, on each small deck! That is to say, I start from step One on the first small deck, and when that work is done, I start from step One on the next small deck.

I break up the deck in parts, and sort each part, more small and more small, and at some time I have no more work to do. Now this may seem slow, with all the rules. But trust me, it is not slow at all. It is much less work than the first way to sort things!

What is this sort called? It is called Quick Sort! That sort was made by a man called CAR Hoare and he called it Quick Sort. Now, Quick Sort gets used all the time!

Quick Sort breaks up big decks in small ones. That is to say, it breaks up big tasks in small ones.

Hmmm. There may be a rule in there, I think. To make big tasks small, break them up.

This sort is quite quick. How quick? Big O tells us: this sort needs O(n log n) work to be done, in the mean case.

Is it more or less fast than the first sort? Big O, please help!

The first sort was O(n squared). But Quick Sort is O(n log n). You know that n log n is less than n squared, for big n, right? Well, that is how we know that Quick Sort is fast!
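To put some real counts on that: say n is one thousand. Then n log n is near ten thousand, but n squared is a full one million.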

If you have to sort a deck, what is the best way? Well, you can do what you want, but I would choose Quick Sort.

Why do I choose Quick Sort? I do not like to work, of course! I want work done as soon as I can get it done.

How do I know Quick Sort is less work? I know that O(n log n) is less than O(n squared). The O’s are more small, so Quick Sort is less work!

Now you know my friend, Big O. He helps us do less work. And if you know big O, you can do less work too!

You learned all that with me! You are so smart! Thanks so much!

Now that work is done, let’s go play!


[1]: There is a way to cheat and add all the things from one to n, all at one time. Some kid named Gauss found this out when he was eight. I am not that smart though, so don’t ask me how he did it .
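(For those who do want the trick: it gives the standard closed form 1 + 2 + 3 + … + n = n(n+1)/2, so the whole sum costs just one multiplication and one division.)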

Assume we’re talking about an algorithm A , which should do something with a dataset of size n .

Then O( ) means, in simple English:

If you’re unlucky when executing A, it might take X(n) operations to complete.

As it happens, there are certain functions (think of them as implementations of X(n) ) that tend to occur quite often. These are well known and easily compared (Examples: 1 , Log N , N , N^2 , N! , etc..)

By comparing these when talking about A and other algorithms, it is easy to rank the algorithms according to the number of operations they may (worst-case) require to complete.

In general, our goal will be to find or structure an algorithm A in such a way that it will have a function X(n) that returns as low a number as possible.

I have a simpler way to understand time complexity. The most common metric for calculating time complexity is Big O notation. This removes all constant factors so that the running time can be estimated in relation to N as N approaches infinity. In general you can think of it like this:

 statement; 

Is constant. The running time of the statement will not change in relation to N

 for ( i = 0; i < N; i++ )
     statement;

Is linear. The running time of the loop is directly proportional to N. When N doubles, so does the running time.

 for ( i = 0; i < N; i++ ) {
     for ( j = 0; j < N; j++ )
         statement;
 }

Is quadratic. The running time of the two loops is proportional to the square of N. When N doubles, the running time increases fourfold.

 while ( low <= high ) {
     mid = ( low + high ) / 2;
     if ( target < list[mid] )
         high = mid - 1;
     else if ( target > list[mid] )
         low = mid + 1;
     else
         break;
 }

Is logarithmic. The running time of the algorithm is proportional to the number of times N can be divided by 2. This is because the algorithm divides the working area in half with each iteration.

 void quicksort ( int list[], int left, int right )
 {
     if ( left >= right )   /* base case added so the recursion terminates */
         return;
     int pivot = partition ( list, left, right );
     quicksort ( list, left, pivot - 1 );
     quicksort ( list, pivot + 1, right );
 }

Is N * log ( N ). The running time consists of N loops (iterative or recursive) that are logarithmic, thus the algorithm is a combination of linear and logarithmic.

In general, doing something with every item in one dimension is linear, doing something with every item in two dimensions is quadratic, and dividing the working area in half is logarithmic. There are other Big O measures such as cubic, exponential, and square root, but they're not nearly as common. Big O notation is described as O(<measure>), where <measure> is one of these growth rates. The quicksort algorithm would be described as O( N * log( N ) ).

Note: None of this has taken into account best, average, and worst case measures. Each would have its own Big O notation. Also note that this is a VERY simplistic explanation. Big O is the most common, but it's also more complex than I've shown. There are also other notations such as big omega, little o, and big theta. You probably won't encounter them outside of an algorithm analysis course.


Say you order Harry Potter: Complete 8-Film Collection [Blu-ray] from Amazon and download the same film collection online at the same time. You want to test which method is faster. The delivery takes almost a day to arrive and the download completed about 30 minutes earlier. Great! So it’s a tight race.

What if I order several Blu-ray movies like The Lord of the Rings, Twilight, The Dark Knight Trilogy, etc. and download all the movies online at the same time? This time, the delivery still takes a day to complete, but the online download takes 3 days to finish. For online shopping, the number of purchased items (the input) doesn’t affect the delivery time. The output is constant. We call this O(1).

For online downloading, the download time is directly proportional to the movie file sizes (input). We call this O(n) .

From the experiments, we know that online shopping scales better than online downloading. It is very important to understand big O notation because it helps you to analyze the scalability and efficiency of algorithms.

Note: Big O notation represents the worst-case scenario of an algorithm. Let’s assume that O(1) and O(n) are the worst-case scenarios of the example above.

Reference : http://carlcheo.com/compsci

If you have a suitable notion of infinity in your head, then there is a very brief description:

Big O notation tells you the cost of solving an infinitely large problem.

And furthermore

Constant factors are negligible

If you upgrade to a computer that can run your algorithm twice as fast, big O notation won’t notice that. Constant factor improvements are too small to even be noticed in the scale that big O notation works with. Note that this is an intentional part of the design of big O notation.

Anything “larger” than a constant factor, however, can be detected.

When you are interested in computations whose size is “large” enough to be treated as approximately infinite, big O notation is approximately the cost of solving your problem.


If the above doesn’t make sense, then you don’t have a compatible intuitive notion of infinity in your head, and you should probably disregard all of the above; the only way I know to make these ideas rigorous, or to explain them if they aren’t already intuitively useful, is to first teach you big O notation or something similar. (Although, once you understand big O notation well, it may be worthwhile to revisit these ideas.)

Simplest way to look at it (in plain English)

We are trying to see how the number of input parameters affects the running time of an algorithm. If the running time of your application is proportional to the number of input parameters, then it is said to be in Big O of n.

The above statement is a good start but not completely true.

A more accurate explanation (mathematical)

Suppose

n=number of input parameters

T(n)= The actual function that expresses the running time of the algorithm as a function of n

c= a constant

f(n)= An approximate function that expresses the running time of the algorithm as a function of n

Then as far as Big O is concerned, the approximation f(n) is considered good enough as long as the below condition is true.

 T(n) ≤ c×f(n)   as n → ∞

The equation is read as: as n approaches infinity, T of n is less than or equal to c times f of n.

In big O notation this is written as

 T(n)∈O(n) 

This is read as T of n is in big O of n.
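For example (with numbers chosen purely to illustrate): if T(n) = 3n + 2 and f(n) = n, then choosing c = 4 makes 3n + 2 ≤ 4n hold for every n ≥ 2, so T(n) ∈ O(n).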

Back to English

Based on the mathematical definition above, if you say your algorithm is Big O of n, it means its running time is a function of n (the number of input parameters) or faster. If your algorithm is Big O of n, then it is also automatically Big O of n squared.

Big O of n means my algorithm runs at least as fast as this bound. You cannot look at the Big O notation of your algorithm and say it’s slow. You can only say it’s fast.

Check this out for a video tutorial on Big O from UC Berkeley. It’s actually a simple concept. If you hear professor Shewchuck (aka God-level teacher) explaining it, you will say “Oh, that’s all it is!”.

What is a plain English explanation of “Big O” notation?

Very Quick Note:

The O in “Big O” stands for “Order” (or, more precisely, “order of”),
so you can take it quite literally: it is used to order algorithms so that they can be compared.

  • “Big O” does two things:

    1. Estimates how many steps of the method your computer applies to accomplish a task.
    2. Facilitates comparison with other methods, in order to determine whether it is good or not.
    3. “Big O” achieves the above two with standardized notations.
  • The seven most commonly used notations are listed below (a concrete comparison follows the list):

    1. O(1): your computer gets a task done in 1 step – excellent, Order No. 1
    2. O(logN): your computer completes a task in logN steps – good, Order No. 2
    3. O(N): finishes a task in N steps – fair, Order No. 3
    4. O(NlogN): ends a task in NlogN steps – not good, Order No. 4
    5. O(N^2): gets a task done in N^2 steps – bad, Order No. 5
    6. O(2^N): gets a task done in 2^N steps – horrible, Order No. 6
    7. O(N!): gets a task done in N! steps – terrible, Order No. 7
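To make the ranking concrete: with N = 8, those step counts come out to roughly 1, 3, 8, 24, 64, 256 and 40,320 respectively (taking the log in base 2).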


Suppose you get the notation O(N^2): not only is it clear that the method takes N*N steps to accomplish a task, you can also see from its rank that it is not as good as O(NlogN).

Please note the order given at the end of each line; it is just there to aid understanding. There are more than 7 notations if all possibilities are considered.

In CS, a set of steps to accomplish a task is called an algorithm.
In this terminology, Big O notation is used to describe the performance or complexity of an algorithm.

In addition, Big O establishes the worst case, i.e. it measures an upper bound on the number of steps.
You could refer to Big-Ω (Big-Omega) for best case.

Big-Ω (Big-Omega) notation (article) | Khan Academy

  • Summary
    “Big O” describes the algorithm’s performance and evaluates it.

    Or, to put it formally, “Big O” classifies algorithms and standardizes the comparison process.

This is a very simplified explanation, but I hope it covers the most important details.

Let’s say the work your algorithm does on the problem depends on some ‘factors’; for example, let’s call them N and X.

Depending on N and X, your algorithm will require some operations, for example in the WORST case it’s 3(N^2) + log(X) operations.

Since Big-O doesn’t care much about the constant factor (the 3 here), the Big-O of your algorithm is O(N^2 + log(X)). It basically reads as “the number of operations your algorithm needs in the worst case scales like this”.
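A minimal sketch (in Java, with names I made up purely for illustration) of what an algorithm with such a worst-case cost could look like:

 // A hypothetical routine whose worst case costs about 3*N^2 + log2(X) elementary steps.
 static long work(int n, long x) {
     long ops = 0;
     for (int i = 0; i < n; i++) {
         for (int j = 0; j < n; j++) {
             ops += 3;          // three constant-cost steps per (i, j) pair -> 3*N^2
         }
     }
     while (x > 1) {            // halving loop runs about log2(X) times
         x /= 2;
         ops++;
     }
     return ops;                // grows like 3*N^2 + log(X), i.e. O(N^2 + log X)
 }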

Definition :- Big O notation is a notation which says how an algorithm’s performance will behave as the input data grows.

When we talk about algorithms there are 3 important pillars: Input, Output and Processing of the algorithm. Big O is a symbolic notation which says, if the input data is increased, at what rate the performance of the algorithm’s processing will vary.

I would encourage you to see this youtube video which explains Big O Notation in depth with code examples.

Algorithm basic pillars

So, for example, assume that an algorithm takes 5 records and the time required for processing them is 27 seconds. Now if we increase the records to 10, the algorithm takes 105 seconds.

In simple words, the time taken grows roughly as the square of the number of records (27 ≈ 5² seconds versus 105 ≈ 10² seconds). We can denote this by O(n^2). This symbolic representation is termed Big O notation.

Now please note the units can be anything: the input can be bytes, bits or a number of records, and the performance can be measured in any unit such as seconds, minutes, days and so on. So it’s not the exact unit that matters, but rather the relationship.

Big O symbols

For example, look at the function “Function1” below, which takes a collection and does processing on the first record. For this function the performance will be the same whether you put in 1,000, 10,000 or 100,000 records. So we can denote it by O(1).

 void Function1(List<string> data)
 {
     string str = data[0];
 }

Now see the function “Function2()” below. In this case the processing time will increase with the number of records. We can denote this algorithm’s performance using O(n).

 void Function2(List<string> data)
 {
     foreach (string str in data)
     {
         if (str == "shiv")
         {
             return;
         }
     }
 }

When we see the Big O notation for an algorithm, we can classify its performance into three categories :-

  1. Log and constant category :- Any developer would love to see their algorithm’s performance in this category.
  2. Linear :- Developers will not want to see their algorithms in this category, unless it’s the last option or the only option left.
  3. Exponential :- This is where we do not want to see our algorithms, and a rework is needed.

So by looking at Big O notation we categorize good and bad zones for algorithms.
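For completeness, here is a small sketch (in Java, my own example rather than the author’s) of a loop that falls into the log category, because it halves the remaining work on every step:

 // Binary-search-style loop: the range shrinks by half each iteration,
 // so it runs about log2(n) times -> the "log" performance category.
 static int indexOf(int[] sorted, int target) {
     int low = 0, high = sorted.length - 1;
     while (low <= high) {
         int mid = (low + high) / 2;
         if (sorted[mid] == target) return mid;
         else if (sorted[mid] < target) low = mid + 1;
         else high = mid - 1;
     }
     return -1;   // not found
 }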

Big O classification

I would recommend you to watch this 10 minutes video which discusses Big O with sample code

https://www.youtube.com/watch?v=k6kxtzICG_g

If I wanted to explain this to a 6-year-old child, I would start by drawing some functions, say f(x) = x and f(x) = x^2, and ask the child which function will be the upper one towards the top of the page. Then we would proceed with the drawing and see that x^2 wins. “Who wins” is really the function which grows faster as x tends to infinity. So “function x is in Big O of x^2” means that x grows more slowly than x^2 as x tends to infinity. The same can be done as x tends to 0: if we draw these two functions for x from 0 to 1, x will be the upper function, so “function x^2 is in Big O of x as x tends to 0”. When the child gets older, I would add that Big O can really be a function which grows not faster, but in the same way as, the given function. Moreover, constants are discarded, so 2x is in Big O of x.