hello friends! new(ish)!

Computer science: Difference between revisions

From InstallGentoo Wiki v2
Jump to navigation Jump to search
>Gomez
>Chchjesus
(Moving some material from the Computer Science page to the Programming page, and vice versa)
Line 16: Line 16:
There is also a good introduction to many concepts explained [http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=alg_index here].
There is also a good introduction to many concepts explained [http://community.topcoder.com/tc?module=Static&d1=tutorials&d2=alg_index here].


==Competitive Programming==
==Sorting Algorithms==
A big topic in computer applications is the quick and efficient sorting of data. There are many different sorting techniques, as a programmer one has to balance the algorithm's ease of reading vs the efficiency of said code. Many standard libraries/APIs already have sorting functions already written, however it is still good practice to know some of the theory behind how the different algorithms work.


'''You are in the jungle. You have a pocket-knife.''' Someone asks you to kill a mountain lion.  
===Bubble Sort===
The bubble sort is a very simple to read algorithm that is the standard first sorting method that most programmers learn. The bubble sort gets its name because values are swapped among each other, gradually moving to the top of the list.


Anyone but a programmer would be asking ''"WTF is a MOUNTAIN lion doing in a JUNGLE?!"'', but that's not what you have been trained to do as a programmer. You are here to solve problems, not to question them.
:<code>//bubble sort example
#include <stdio.h>
#define SIZE 10
:int main( void ){
::int a[SIZE] = {2,6,9,1,4,15,7,2}; //array to be sorted
::int count; //to be used to count passes
::int i; //comparisons counter
::int hold; //temp variable to hold numbers
:
:://start of bubble sort
::for(count=0; count<SIZE-1; count++){//loop to control number of passes
:://size is decreased by 1 to prevent array out of bounds error
:
:::for(i=0; i<SIZE-1; i++){//comparison loop
::::if(a[i]>a[i+1]){//if a[i] bigger than a[i+1], swap values
:::::hold = a[i];
:::::a[i] = a[i+1];
:::::a[i+1] = hold;
::::}//end if
:::}//end inner for
:
::}//end outer for
:    
:://print array
::for(i=0; i<SIZE; i++)
:::printf("%4d", a[i]);
::return 0;
:}</code>


Years of training has taught you well. You use your knife to sharpen a stick. You cut vines to lash sharp stones on one end. Maybe you're from a top university, and you've learned to extract essential ingredients from plant and insect life around you to fashion a poison to tip your weapon with.
===Selection Sort===
A bit further up on the food chain is the selection sort. The selection sort goes through each element position and finds the value which should occupy the next position in the sorted array. When it finds the appropriate element, the algorithm exchanges it with the valuewhich previously occupied the desired position. On the second pass it looks for the second smallest element and swaps it with the second position and so on.


Convinced that you have an effective and efficient way to kill the lion, you set forth to accomplish your task. Maybe your stick is too short, or your poisons don't work. It's okay - you live to refine your method and try again another day.
:<code>//Selection sort example
:// finds smallest element between left hand and right hand
:// moves element to correct position
:#include <stdio.h>
:void selectionSort(int array[], int n){
://n is length
:int lh, rh, i, temp;
:
:for (lh=0; lh < n; lh++){//everything left of lh is considered sorted
::rh = lh; //reset right hand
:://runs through lh-rh bounds of array
::for (i = lh + 1; i<n; i++){
::://if lowest element found, copy index to right hand
:::if (array[i] < array[rh]) rh = i;
::}
:://move lowest element to correct position
::temp = array[lh];
::array[lh] = array[rh];
::array[rh] = temp;
::}
:}
:int main( void ){
::int len = 8;
::int a[8] = {2,6,9,1,4,15,7,2}; //array to be sorted
::int i;
:
::selectionSort(a, len);
:://print array
::for(i=0; i<len; i++)
::printf("%4d", a[i]);
::return 0;
:}</code>


Then someone figures out a way to fashion a low-grade explosive from harvesting chemicals in the jungle. Your method of fashioning a spear to kill the lion is now far from the best way to accomplish your task. Nevertheless, it's still a simple way, and will continue to be taught in schools. Every lion-killer will be taught how to build his tools from scratch.
===Merge sort===
    // This is an iterative version of merge sort.
    // It merges pairs of two consecutive lists one after another.
    // After scanning the whole array to do the above job,
    // it goes to the next stage. Variable k controls the size
    // of lists to be merged. k doubles each time the main loop
    // is executed.
    #include <stdio.h>
    #include <math.h>
    int i,n,t,k;
    int a[100000],b[100000];
    int merge(l,r,u)
      int l,r,u;
    { int i,j,k;
      i=l; j=r; k=l;
      while (i<r && j<u) {
        if (a[i]<=a[j]) {b[k]=a[i]; i++;}
        else {b[k]=a[j]; j++;}
        k++;
      }
      while (i<r) {
        b[k]=a[i]; i++; k++;
      }
      while (j<u) {
        b[k]=a[j]; j++; k++;
      }
      for (k=l; k<u; k++) {
        a[k]=b[k];
      }
    }
    sort()
    { int k,u;
      k=1;
      while (k<n) {
        i=1;
        while (i+k<=n) {
          u=i+k*2;
          if (u>n) u=n+1;
          merge(i,i+k,u);
          i=i+k*2;
        }
        k=k*2;
      }
    }
    main()
    { printf("input size \n");
      scanf("%d",&n);
    /*  for (i=1;i<=n;i++) scanf("%d",&a[i]); */
      for (i=1;i<=n;i++) a[i]=random()%1000;
      t=clock();
      sort();
      for (i=1;i<=10;i++) printf("%d ",a[i]);
      printf("\n");
      printf("time= %d millisec\n",(clock()-t)/1000);
    }


That's "real-life" programming.
===Performance===


In competitive programming, you start out with the same resources (a pocket-knife), except '''you have 2 minutes to kill the lion.'''
==Search Algorithms==
Think of an ordered phone book. Consider two ways to search it for a name: linear search and binary search.  


As a beginner, you will stare at the lion and do nothing.
Linear search consists of starting at page 1, at the first name, and checking every name, one at a time, in order, until you find the name. Considering a phone book of one million names, this is very useful if you're searching for the number of Aaron A. Aardvark, but could be tedious if looking for Matthew Maloney, or downright awful if looking for a person named Z. Zerthis.


Soon, you learn that if you kill a squirrel, sometimes the judge thinks it's a lion and you're good to go.
Binary search consists of opening the book up to half way, and comparing the name you are searching for with the items on either side. If the name you are looking for is earlier on, then you'll want the left half. If it is later on, you'll want the right half. What you have done here, is effectively cut the amount of names to search in half. You then go to the middle of whichever half you chose, and repeat the process until you've found what you're looking for. This is what we humans naturally do in a general sense.


A more experienced programmer just keeps stabbing the lion and hopes that the lion dies in time. Soon, you learn that there are certain spots on a lion that are damage immune. You learn to not even bother stabbing those spots. Sometimes, the lion doesn't expose those spots, so you get really good at killing squirrels.
To really get a hold on how much better Binary search is compared to linear search, consider a name that is roughly 3 quarters to the end of the phone book. With 1 million names, a linear search would take you roughly 750,000 comparisons before you found that name. With binary search, it would take roughly 3 or 4 comparisons.  


And then, to be a great competitive programmer, you need to be able to do two things.
There are also two string searching/pattern matching algorithms known as the Knuth-Morris-Pratt (KMP) and the Boyer-Moore (BM) algorithms.


First, you must learn how to find the lion's critical point and kill it in one swift stroke.
Computers use graph searching algorithms called the Floyd-Warshall algorithm, and Dijkstra's (pronounced Daik-stra) algorithm, to find the optimal path between two or more nodes connected by pathes on a graph.


Second, you must learn how to be so handy with your knife that you can fashion a sharp stick in 1 minute, and spend the next minute stabbing the lion to death.
==Recursion==
Recursion is the concept of something entering inside itself, or other a new ''instance'' of itself. Functional programming languages such as Lisp dialects are known to employ this often.


But never ever will you be able to have enough time to fashion an explosive to take the lion out.
Recursion can be tough to get your head around at first, because for a lot of people it is a new concept. However, once understood, it's quite simple. The main things to remember are the '''base case''', the '''terminating case''', and the concept of an '''accumulator'''.


===Material===
Base case: This is the starting step of recursion.


====Books====
Termination case: This tells the recursion when to stop. Without this, it would simply recur forever, or at least until the machine runs out of memory.


There are books that teach programming competition. They are a very good resource to learn computer Science
Accumulator: This is how you can pass data through a function recursively. It's called an accumulator because it ''accumulates'' data, or collects it as it goes. Here's an example of a program in Python that recursively adds numbers from 0 to 10 into a list, and then returns it.


* http://www.comp.nus.edu.sg/~stevenha/myteaching/competitive_programming/cp1.pdf
The base case will be starting at 0. The terminating case is at 10. We'll use a list as an accumulator.
** https://sites.google.com/site/stevenhalim/
* (A less good one)
** http://acm.cs.buap.mx/downloads/Programming_Challenges.pdf


====Compete====
Lets define our function:


There are many pages to compete in online programming competitions. They often offer tutorials and many other more:
    def one_to_ten(number, accumulator):
        if number > 10:
            return accumulator
        else:
            accumulator.append(number)
            return one_to_ten(number + 1, accumulator)
   
    empty_list = []
    base_case = 0
   
    result = one_to_ten(base_case, empty_list)
   
    print(result) # result is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


* [http://tocpoder.com/tc TopCoder]
Going through it step by step:
* [http://codeforces.com/ Codeforces]
* [http://www.codechef.com/ Codechef]
* [https://code.google.com/codejam/ Google CodeJam]
* [https://www.facebook.com/hackercup Facebook HackerCup] (I do not reccomend this one)
* [http://icpc.baylor.edu/worldfinals/results ICPC] (Might be available in your college)
* [https://en.wikipedia.org/wiki/International_Olympiad_in_Informatics IOI] If you are underage and autistic, look for the local version in your country of this one.


====Training sites====
  1. Our function takes a number and the accumulator
  2. Check that the termination case hasn't been met
  3. If it has, don't call the function anymore. Just return the accumulator.
  4. Otherwise, put the number in the accumulator
  5. Call the function from within itself, but with number + 1, so the next recursion of it
    will have the next number
 
So, the function enters a second instance of itself, but with number plus one, and it does this 10 times, until number is 10 (which will be 10 levels deep). However, when number becomes 11, it will not recur anymore, and it will return the accumulator back to the 9th function, which will also finish and return to the 8th function, which returns back to the 7th function, and so on until the first call of the function has returned. You could say it naturally ''unrolls'' itself.
 
==Linked Lists==
[https://paste.installgentoo.com/view/b2ff8583 Linked List Example]
 
==Binary Search Trees==
 
A binary search tree is a way to very quickly search sorted data. It is simply constructing a tree by following a binary search.
 
    /* This program repeatedly inserts n random numbers into
      a binary search tree. Then it traverse the tree
      in in-order to store the keys into array a.
      After the traversal, sorted numbers are in array a. */
    #include <stdio.h>
    #include <math.h>
    #include <stdlib.h>
    #define nil 0
    int k=1;
    struct item *proot;
    struct item{
      int key;
      struct item *adr;
      struct item *left;
      struct item *right;
    };
    struct item a[1000000];
    void insert(int x){
      struct item *p, *q;
      p=proot;
      do{
        q=p;
        if (x<=(*p).key)p=(*p).left; else p=(*p).right;
      }
      while (p!=nil);
      p=malloc(sizeof(struct item));
      (*p).key=x; (*p).left=nil; (*p).right=nil;
      (*p).adr=p;
      if (x<=(*q).key) (*q).left=p; else (*q).right=p;
    }
    traverse(struct item *proot)
    { struct item *p;
      p=proot; 
      if (p!=nil){
        printf("(");
        traverse((*p).left);
        printf("%d", (*p).key);
        a[k]=(*p); k++;
        traverse((*p).right);
        printf(")");
      }
    }
    main()
    {
      int i,n,t;
      scanf("%d",&n);
      proot=malloc(sizeof(struct item));
      (*proot).key=99999;
      (*proot).left=nil;
      (*proot).right=nil;
      for(i=1;i<=n;i++) a[i].key=random()%100000;
    //  for (i=1;i<=n;i++)printf("%d ",a[i].key);
      printf("\n");
      t=clock();
      for (i=1;i<=n;i++)insert(a[i].key);
      traverse(proot->left);
      printf("\n");
      printf("time= %d millisec\n\n", (clock()-t)/1000);
    /*
      for (i=1;i<=n;i++)
        printf("%3d %3d %3d %3d\n\n",(int)(a[i].adr) % 1000, a[i].key,
                                (int)(a[i].left) % 1000, (int)(a[i].right) % 1000);
    */
    }
 
==Hash Tables==


If you want to train:


* [http://uva.onlinejudge.org/ UVA]
* [http://usaco.org/ USACO]
* [http://ahmed-aly.com/ Ahmed Aly's Webpage]
* [https://projecteuler.net/ Project Euler]


[[Category:Programming]]
[[Category:Programming]]

Revision as of 09:02, 29 January 2014

You do need to know how to code before starting with this.

If you are reading this you are probably a neet. Computer Science is something really nice, but has little practical applications unless you have finished college and started doing something really interesting. If you want to learn how to do something useful, you have to learn how to program, not how to do Computer Science. How to be useful is not discussed here; however, if you are autistically good at it, you can be instantly hired by Google, Microsoft or facebook.

There are books about computer science at http://books.gentoomen.org/. Buy phisical ones from http://abebooks.com


Algorithms

An algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values as output. It is the part that does the magic.

There are a few programming concepts discussed in the Programming concepts page.

For the rest, check Introduction to Algorithms (by Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein), and if you have the balls and the time, check The Art of Computer Programming (by Donald D. Knuth).

There is also a good introduction to many concepts explained here.

Sorting Algorithms

A big topic in computer applications is the quick and efficient sorting of data. There are many different sorting techniques, as a programmer one has to balance the algorithm's ease of reading vs the efficiency of said code. Many standard libraries/APIs already have sorting functions already written, however it is still good practice to know some of the theory behind how the different algorithms work.

Bubble Sort

The bubble sort is a very simple to read algorithm that is the standard first sorting method that most programmers learn. The bubble sort gets its name because values are swapped among each other, gradually moving to the top of the list.

//bubble sort example
#include <stdio.h>
#define SIZE 10
int main( void ){
int a[SIZE] = {2,6,9,1,4,15,7,2}; //array to be sorted
int count; //to be used to count passes
int i; //comparisons counter
int hold; //temp variable to hold numbers
//start of bubble sort
for(count=0; count<SIZE-1; count++){//loop to control number of passes
//size is decreased by 1 to prevent array out of bounds error
for(i=0; i<SIZE-1; i++){//comparison loop
if(a[i]>a[i+1]){//if a[i] bigger than a[i+1], swap values
hold = a[i];
a[i] = a[i+1];
a[i+1] = hold;
}//end if
}//end inner for
}//end outer for
//print array
for(i=0; i<SIZE; i++)
printf("%4d", a[i]);
return 0;
}

Selection Sort

A bit further up on the food chain is the selection sort. The selection sort goes through each element position and finds the value which should occupy the next position in the sorted array. When it finds the appropriate element, the algorithm exchanges it with the valuewhich previously occupied the desired position. On the second pass it looks for the second smallest element and swaps it with the second position and so on.

//Selection sort example
// finds smallest element between left hand and right hand
// moves element to correct position
  1. include <stdio.h>
void selectionSort(int array[], int n){
//n is length
int lh, rh, i, temp;
for (lh=0; lh < n; lh++){//everything left of lh is considered sorted
rh = lh; //reset right hand
//runs through lh-rh bounds of array
for (i = lh + 1; i<n; i++){
//if lowest element found, copy index to right hand
if (array[i] < array[rh]) rh = i;
}
//move lowest element to correct position
temp = array[lh];
array[lh] = array[rh];
array[rh] = temp;
}
}
int main( void ){
int len = 8;
int a[8] = {2,6,9,1,4,15,7,2}; //array to be sorted
int i;
selectionSort(a, len);
//print array
for(i=0; i<len; i++)
printf("%4d", a[i]);
return 0;
}

Merge sort

   // This is an iterative version of merge sort.
   // It merges pairs of two consecutive lists one after another.
   // After scanning the whole array to do the above job,
   // it goes to the next stage. Variable k controls the size
   // of lists to be merged. k doubles each time the main loop
   // is executed.
   #include <stdio.h>
   #include <math.h>
   int i,n,t,k;
   int a[100000],b[100000];
   int merge(l,r,u)
     int l,r,u;
   { int i,j,k;
     i=l; j=r; k=l;
     while (i<r && j<u) { 
       if (a[i]<=a[j]) {b[k]=a[i]; i++;} 
       else {b[k]=a[j]; j++;}
       k++;
     }
     while (i<r) { 
       b[k]=a[i]; i++; k++;
     }
     while (j<u) { 
       b[k]=a[j]; j++; k++;
     }
     for (k=l; k<u; k++) { 
       a[k]=b[k]; 
     }
   }
   sort()
   { int k,u;
     k=1;
     while (k<n) {
       i=1;
       while (i+k<=n) {
         u=i+k*2;
         if (u>n) u=n+1;
         merge(i,i+k,u);
         i=i+k*2;
       }
       k=k*2;
     }
   }
   main()
   { printf("input size \n");
     scanf("%d",&n); 
   /*  for (i=1;i<=n;i++) scanf("%d",&a[i]); */
     for (i=1;i<=n;i++) a[i]=random()%1000;
     t=clock();
     sort();
     for (i=1;i<=10;i++) printf("%d ",a[i]);
     printf("\n");
     printf("time= %d millisec\n",(clock()-t)/1000);
   }

Performance

Search Algorithms

Think of an ordered phone book. Consider two ways to search it for a name: linear search and binary search.

Linear search consists of starting at page 1, at the first name, and checking every name, one at a time, in order, until you find the name. Considering a phone book of one million names, this is very useful if you're searching for the number of Aaron A. Aardvark, but could be tedious if looking for Matthew Maloney, or downright awful if looking for a person named Z. Zerthis.

Binary search consists of opening the book up to half way, and comparing the name you are searching for with the items on either side. If the name you are looking for is earlier on, then you'll want the left half. If it is later on, you'll want the right half. What you have done here, is effectively cut the amount of names to search in half. You then go to the middle of whichever half you chose, and repeat the process until you've found what you're looking for. This is what we humans naturally do in a general sense.

To really get a hold on how much better Binary search is compared to linear search, consider a name that is roughly 3 quarters to the end of the phone book. With 1 million names, a linear search would take you roughly 750,000 comparisons before you found that name. With binary search, it would take roughly 3 or 4 comparisons.

There are also two string searching/pattern matching algorithms known as the Knuth-Morris-Pratt (KMP) and the Boyer-Moore (BM) algorithms.

Computers use graph searching algorithms called the Floyd-Warshall algorithm, and Dijkstra's (pronounced Daik-stra) algorithm, to find the optimal path between two or more nodes connected by pathes on a graph.

Recursion

Recursion is the concept of something entering inside itself, or other a new instance of itself. Functional programming languages such as Lisp dialects are known to employ this often.

Recursion can be tough to get your head around at first, because for a lot of people it is a new concept. However, once understood, it's quite simple. The main things to remember are the base case, the terminating case, and the concept of an accumulator.

Base case: This is the starting step of recursion.

Termination case: This tells the recursion when to stop. Without this, it would simply recur forever, or at least until the machine runs out of memory.

Accumulator: This is how you can pass data through a function recursively. It's called an accumulator because it accumulates data, or collects it as it goes. Here's an example of a program in Python that recursively adds numbers from 0 to 10 into a list, and then returns it.

The base case will be starting at 0. The terminating case is at 10. We'll use a list as an accumulator.

Lets define our function:

   def one_to_ten(number, accumulator):
       if number > 10:
           return accumulator
       else:
           accumulator.append(number)
           return one_to_ten(number + 1, accumulator)
   
   empty_list = []
   base_case = 0
   
   result = one_to_ten(base_case, empty_list)
   
   print(result) # result is [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Going through it step by step:

 1. Our function takes a number and the accumulator
 2. Check that the termination case hasn't been met
 3. If it has, don't call the function anymore. Just return the accumulator.
 4. Otherwise, put the number in the accumulator
 5. Call the function from within itself, but with number + 1, so the next recursion of it
    will have the next number

So, the function enters a second instance of itself, but with number plus one, and it does this 10 times, until number is 10 (which will be 10 levels deep). However, when number becomes 11, it will not recur anymore, and it will return the accumulator back to the 9th function, which will also finish and return to the 8th function, which returns back to the 7th function, and so on until the first call of the function has returned. You could say it naturally unrolls itself.

Linked Lists

Linked List Example

Binary Search Trees

A binary search tree is a way to very quickly search sorted data. It is simply constructing a tree by following a binary search.

   /* This program repeatedly inserts n random numbers into
      a binary search tree. Then it traverse the tree
      in in-order to store the keys into array a.
      After the traversal, sorted numbers are in array a. */
   #include <stdio.h>
   #include <math.h>
   #include <stdlib.h>
   #define nil 0
   int k=1;
   struct item *proot;
   struct item{
     int key;
     struct item *adr;
     struct item *left;
     struct item *right;
   };
   struct item a[1000000];
   void insert(int x){
     struct item *p, *q;
     p=proot;
     do{
       q=p;
       if (x<=(*p).key)p=(*p).left; else p=(*p).right;
     }
     while (p!=nil);
     p=malloc(sizeof(struct item)); 
     (*p).key=x; (*p).left=nil; (*p).right=nil;
     (*p).adr=p;
     if (x<=(*q).key) (*q).left=p; else (*q).right=p;
   }
   traverse(struct item *proot)
   { struct item *p;
     p=proot;  
     if (p!=nil){
       printf("("); 
       traverse((*p).left);
       printf("%d", (*p).key);
       a[k]=(*p); k++;
       traverse((*p).right);
       printf(")"); 
     }
   }
   main()
   { 
     int i,n,t;
     scanf("%d",&n);
     proot=malloc(sizeof(struct item));
     (*proot).key=99999;
     (*proot).left=nil;
     (*proot).right=nil;
     for(i=1;i<=n;i++) a[i].key=random()%100000;
   //  for (i=1;i<=n;i++)printf("%d ",a[i].key);
     printf("\n");
     t=clock();
     for (i=1;i<=n;i++)insert(a[i].key);
     traverse(proot->left);
     printf("\n");
     printf("time= %d millisec\n\n", (clock()-t)/1000);
   /*
     for (i=1;i<=n;i++)
       printf("%3d %3d %3d %3d\n\n",(int)(a[i].adr) % 1000, a[i].key, 
                                (int)(a[i].left) % 1000, (int)(a[i].right) % 1000);
   */
   }

Hash Tables