Arrays, Pointers, Strings and Structs

Arrays

An array holds multiple values of the same data type. To access a particular element you need the array name and the index, which is appended to the name in square brackets. Whether a variable is an array is determined by its declaration. The naming rules are the same as for basic data types. To declare a variable as an array, the size of the array is appended in square brackets, e.g.

int values[12];

defines an integer array with twelve elements. Note that the index always starts at 0, so the last array element has the index 11 in this case. Only integer values are allowed for the index.

It is important that an array as a whole cannot be assigned: it can never stand on the left hand side of an assignment operation, and one array cannot be copied into another with a single assignment. Only array elements are allowed. As an example, the following code fills element i of the array with the sum of the integers from 1 to i+1:

int sum[100], i;
sum[0] = 1;
for (i = 1; i < 100; i++) {
    sum[i] = (i + 1) + sum[i - 1];
}

Note that the index i runs from 1 to 99. Starting at 0 would produce the index -1 on the right hand side of the last assignment. Some compilers catch the error when an index exceeds the bounds of an array, but you cannot rely on it. An index that runs out of bounds can corrupt the content of other variables or crash the program. Note also that arrays and simple variables can be declared in the same declaration line.
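Because an array cannot be assigned as a whole, copying it has to be done element by element. The following is a minimal sketch of that idea; the function name copy_and_check and the second array copy are made up for illustration and are not part of the example above.

```c
#define N 100   /* array size, fixed at compile time */

/* Fills sum[] as in the example above, copies it element by element
   into copy[] and returns the last element of the copy. */
int copy_and_check(void)
{
    int sum[N], copy[N], i;

    sum[0] = 1;
    for (i = 1; i < N; i++) {
        sum[i] = (i + 1) + sum[i - 1];
    }

    /* "copy = sum;" would not compile: arrays cannot be assigned as a whole */
    for (i = 0; i < N; i++) {   /* valid indices run from 0 to N-1 */
        copy[i] = sum[i];
    }

    return copy[N - 1];         /* 1 + 2 + ... + 100 */
}
```

Both loops stay strictly below N, so no index ever leaves the bounds of the two arrays.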

Pointers

Pointers are the most powerful variables in the C language but also the most difficult to handle. Technically, a pointer holds the address of a specific chunk of memory. The size of that chunk depends on the data type the pointer refers to.

The declaration of a pointer is similar to that of an array, but with an asterisk in front of the variable name instead of the square brackets at the end, e.g.

int a,b,*c;

The variables a and b are integer variables, while c points to one (it holds the memory address of an integer variable). You can regard it as a reference to a value and not the value itself. It is highly dangerous to assign an integer value directly to a pointer, except for zero. A zero value indicates a null pointer, which points to no memory at all. Pointers should get their value (the memory address) from a reference operation. The reference operator is the ampersand symbol & in front of the name of a variable. Similarly, the dereference operator of a pointer (to access the value in the memory segment the pointer is pointing to) is the asterisk symbol.

Example (with the definition of the variables above):

a = 10;
c = &a;
a = 5;
b = *c;
*c = 1;
c = &b;

The code behaves as follows. The value 10 is assigned to variable a. The pointer c is set to the memory location of variable a, using the reference operator on a (a has the value 10, &a is its address in memory, which depends on the compiler and the state of the computer when the program is started). In the third line 5 is assigned to variable a and therefore stored at the location the pointer c points to. Line 4 uses the dereference operator to access that value. Because it is the value of a, b is assigned the same value, but not the same memory location: the program copies the content stored at the address of a into the location of b. The dereference operator can also be used on the left hand side of an assignment. Here (line 5) the value 1 is stored at the location c points to. Thus the variable a, which occupies the same memory location, now holds the value 1, while b keeps the value 5. Finally the pointer is directed to the memory location of b.

The power of pointers comes with the possibility to grab a specific chunk of memory at run time. Note that the size of an array is fixed at compile time, so the program is useless when the user needs more space (e.g. the array holds the energy for 1000 particles, but the user wants to run the simulation with 2000 particles). Thanks to the similarity of arrays and pointers, the problem can be solved. What is needed are functions to allocate and free pieces of memory. These are very low level functions, but not very difficult to handle. The function malloc(n) returns a free piece of memory with a size of n bytes and can stand on the right hand side of an assignment to a pointer. The function to release the memory is free(p), where p is the pointer that points at the allocated memory. The following example allocates the memory to hold 2000 integer values, fills the elements with the values 1 to 2000 and then releases the memory.
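The walk-through can be checked with a small function that repeats the six statements and hands back the final values. This is only a sketch: the name pointer_demo and the two output parameters a_out and b_out are hypothetical and merely serve to make the result visible to the caller.

```c
/* Repeats the statements of the example and reports the final values
   of a and b through the pointers passed in by the caller. */
void pointer_demo(int *a_out, int *b_out)
{
    int a, b, *c;

    a = 10;
    c = &a;       /* c holds the address of a                  */
    a = 5;        /* *c now reads 5 as well                    */
    b = *c;       /* b gets a copy of the value 5              */
    *c = 1;       /* writes 1 into a; b is not affected        */
    c = &b;       /* c now points to b                         */

    *a_out = a;   /* a was changed through the pointer: 1      */
    *b_out = *c;  /* read through c, which points to b: 5      */
}
```

Calling pointer_demo(&x, &y) with two integer variables x and y leaves 1 in x and 5 in y, and at the same time shows the typical use of the reference operator when passing addresses to a function.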

int *p, i;
p = malloc(2000 * sizeof(int));
for (i = 0; i < 2000; i++) {
    *(p + i) = i + 1;
}
free(p);

The example requires some explanation. The sizeof operator, which is written like a function call, returns the memory size in bytes required for a data type, here an integer. Typically an integer is 4 bytes long. So malloc(2000) would only reserve 2000 bytes, enough space to hold only 500 integers. Incrementing a pointer by one does not advance the pointer by one byte but by the size of the data type it points to. If an integer is 4 bytes long, as in this case, p+1 points 4 bytes behind the position p points at. Note that the parentheses in *(p+i) ensure that the addition is carried out before the dereference operator is applied. The slightly different expression *p+1 dereferences first and then adds one: it yields the value of the first integer in the memory chunk plus one, without changing the stored value, and is only allowed on the right hand side of an assignment. The function free(p) has the return type void, which means it does not return anything at all. Thus it can stand alone without being part of an assignment operation.
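The particle example mentioned earlier needs a size that is only known at run time. A minimal sketch of this, assuming a hypothetical function fill_and_sum that receives the number of elements n as an argument; it also checks the pointer returned by malloc, which is a null pointer when no memory is available.

```c
#include <stdlib.h>

/* Allocates n integers, fills them with 1..n and returns their sum.
   Returns -1 if n is not positive or the allocation fails. */
long fill_and_sum(int n)
{
    int *p, i;
    long total = 0;

    if (n <= 0) {
        return -1;
    }
    p = malloc(n * sizeof(int));
    if (p == NULL) {            /* malloc could not provide the memory */
        return -1;
    }
    for (i = 0; i < n; i++) {
        p[i] = i + 1;           /* p[i] is equivalent to *(p+i) */
        total += p[i];
    }
    free(p);                    /* release the memory after use */
    return total;
}
```

The same function works for 1000 or 2000 elements without recompiling, which is exactly what a fixed-size array cannot offer.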

Strings

Strings are special cases of pointers to, or arrays of, characters (data type char). Each character is one byte long, and a string (not the allocated memory) is terminated with a zero value character. Working with strings is mostly done with the wide set of string functions in the standard C library. You can initialize strings in the declaration by assigning them a literal constant (written out text between quotation marks). The initialization works for pointers and arrays, with the difference that for pointers only the space to hold the assigned string constant is allocated.

Here is an example with strings.

char *s1 = "A string";
char s2[20] = "Another string";
char s3[40];
strcpy(s3, s1);
s2[7] = 0;

Note that s1 points to a constant of 9 bytes (the 8 characters of the text and the termination byte), while s2 and s3 are 20 and 40 bytes long, respectively, as given in their declarations. The function strcpy copies the content of s1 to s3. Note that the memory of the target (the first argument) must be large enough to hold the source string including its termination byte. The last line is a low level operation. By writing the value 0 (not the '0' character) to the 8th character position, the string is now terminated there and only holds "Another". Technically the cut off content is not lost, and the operation s2[7]=' ' would restore the original string.
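Since strcpy does not check the size of its target, the check has to be done by the programmer. One possible way is sketched below; the helper copy_if_fits and its parameter names are hypothetical and not part of the standard library, while strlen and strcpy are the standard functions declared in string.h.

```c
#include <string.h>

/* Copies src into dst only if it fits, including the terminating zero.
   Returns 1 on success and 0 if dst (dst_size bytes long) is too small. */
int copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    /* strlen counts the characters without the termination byte */
    if (strlen(src) + 1 > dst_size) {
        return 0;
    }
    strcpy(dst, src);
    return 1;
}
```

With the declarations of the example above, copy_if_fits(s3, sizeof(s3), s1) succeeds because 9 bytes fit easily into the 40 bytes of s3, whereas a target array of only 4 bytes would be rejected.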

Structs

Structs are derived data types, which hold a whole set of basic data types, arrays, pointers or other derived data types. The definition of the derived data type is normally placed outside any function, so that it is visible to all functions that follow. An example is

typedef struct vector {
    double x;
    double y;
    double z;
} vector;

for defining a vector. The keyword struct is mandatory and is followed by the name of the struct; the leading typedef, together with the name after the closing bracket, makes the type available under the short name vector. The curly brackets hold the declarations of all members, which can be basic types, arrays, pointers and already defined derived data types, as well as pointers to the data type derived here.

Once defined, the data type can be used to declare variables. The individual members are accessed by appending the member name to the variable name, separated by a period. Pointers also allow a member specific dereference operation, which replaces the period with the right arrow ->.

Example, using the vector definition from above.
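Member access through pointers is most common when a struct is handed to a function. As a small sketch, the hypothetical function dot below computes the scalar product of two vectors; the typedef is repeated so that the block is complete on its own.

```c
typedef struct vector {
    double x;
    double y;
    double z;
} vector;

/* Scalar product of two vectors. Both are passed as pointers,
   so the members are accessed with the arrow operator. */
double dot(const vector *a, const vector *b)
{
    return a->x * b->x + a->y * b->y + a->z * b->z;
}
```

For two variables u and v of type vector the call is dot(&u, &v); passing the addresses avoids copying all three members of each struct.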

double a;
vector r, *p;
a = 1.23;
r.x = a;
r.y = 0;
r.z = 0;
p = &r;
p->y = -1.e-2;

Note that the last line can also be written as (*p).y=-1.e-2; The power of structs lies in linked lists, in which a struct contains pointers to its own derived data type as members. They can be used to build up data sets dynamically, linked as a ring, tree or other geometry (e.g. to hold the content of directories and files in memory).
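A singly linked list is the simplest of these structures. The sketch below is one possible layout, not a fixed recipe: the type node and the functions push_front, list_sum and list_free are names chosen for this illustration. Each node is allocated with malloc and holds a pointer to the next node, with a null pointer marking the end of the list.

```c
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;   /* pointer to the data type being defined */
} node;

/* Puts a new node in front of the list and returns the new head,
   or a null pointer if no memory is available. */
node *push_front(node *head, int value)
{
    node *n = malloc(sizeof(node));
    if (n == NULL) {
        return NULL;
    }
    n->value = value;
    n->next = head;
    return n;
}

/* Adds up all values by following the next pointers to the end. */
int list_sum(const node *head)
{
    int total = 0;
    while (head != NULL) {
        total += head->value;
        head = head->next;
    }
    return total;
}

/* Releases every node of the list. */
void list_free(node *head)
{
    while (head != NULL) {
        node *next = head->next;   /* remember it before the node is freed */
        free(head);
        head = next;
    }
}
```

Starting from an empty list (a null pointer), each call such as list = push_front(list, 3); adds one element, so the list grows at run time without any size being fixed in advance.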
