Randomize File


Randomizing a file Here is a program that reads a file of words, randomizes its order, and write it back out. The key randomizing code is in red below. Here is how it works:

The first step chooses a random index from 0 to the end, and interchanges the element at this index with the 0th element. Possibly the 0th element itself way chosen, so that it remains where it was. If the RNG is good, each element of the array is equally likely to end up in the 0th slot. This 0th element is in its final position and remains there. The order of the remaining elements does not matter, since they will be randomized by the remaining steps of the algorithm.

The next step chooses a random index from 1 to the end, and interchanges the element at this index with the 1st element. Again perhaps 1 is chosen as the random index. The algorithm proceeds until a final interchange at the end is either taken or not. It can be proved that if the random number generator is perfect, then all possible rearrangements of the elements are equally likely.

Copy and Randomize a File Runs of File Randomize
#include <stdio.h>
#include <math.h>
#include <time.h>
#define ASIZE 26000

int ranint(int a, int b);
double rand();
double seed;

int main(int argc, char *argv[]) {
   int ch, i, size, loc;
   char *temp;
   char buf[40];
   char *a[ASIZE];
   FILE *infile, *outfile;
   seed = (double)time(NULL);
   infile = fopen("../wordst/words_unix_ns","r");
   outfile = fopen("ranwords_unix2", "w");
   if (infile == NULL) {
      fprintf(stderr, "Can't open %s\n", "??");
      exit(1);
   }
   i = 0;
   /* read file into array a*/
   while (fscanf(infile, "%s", buf) != EOF) {
      a[i] = (char *) malloc(strlen(buf) + 1);
      strcpy(a[i], buf);
      i++;
   }
   /* randomize file in place */
   size = i;
   for (i = 0; i < size - 1; i++) {
      loc = ranint(i, size-1);
      temp = a[i];
      a[i] = a[loc];
      a[loc] = temp;
   }
   /* output file */
   for (i = 0; i < size; i++)
      fprintf(outfile, "%s\n", a[i]);
   
}

/* unif dist random int betw a and b inclusive */
int ranint(int a, int b) {
   return (int)(rand()*(b-a+1)+a);
}

/* unif dist random double between 0 and 1 */
double rand() {
   double a = 16807.0,
          m = 2147483647.0;
   double q;
   seed = a*seed;
   q = floor(seed/m);
   seed = seed - q*m;
   return(seed/m);
}
% cc -o filecopy filecopy.c
% cat ../wordst/words_unix_ns
10th
1st
2nd
a
AAAS
Aarhus
Aaron
Ababa
aback
abacus
abalone
abandon
abase
abash
abate
abater
abbas
abbey
abbot
Abbott
abbreviate
% cc -o randomize randomize.c -lm
% randomize
% cat ranwords_unix2
AAAS
abacus
abbot
10th
Aarhus
a
abbas
abater
2nd
Abbott
abbey
abate
aback
abbreviate
Ababa
abase
1st
abash
Aaron
abandon
abalone
%