Gillius's Programming

String output and input work much the same in C as they do for other variables. In C++, cout and cin are made to work with char * parameters the way a programmer would expect. Displaying a string in C and C++ works exactly the same way it did in the Output I lesson.

#include <cstdio>
#include <iostream>

using namespace std;

int main() {
  char MyName[] = "Jason";
  printf("%s\n", MyName);
  cout << MyName << endl;
}

Nothing much is new or surprising in this program except that there is no number in the array declaration. This is not a bug: since we are working with a string, we can let the compiler use the exact size needed. When an array is initialized and the size is left blank, the array becomes exactly as large as its initializer. In this case there are 5 characters in the name, so the size is 6 characters: 5 for the name, and one for the null character ('\0') at the end, which is implied when using double quotes.

Input for strings looks a little funky for C and C++. C and C++ can input strings like this:

char str1[500] = "";
cin >> str1;        //C++ "streams" method
scanf("%499s", str1);  //C "stdio" method, limited to the array size

//C++ style (preferred):
string cppStr;
cin >> cppStr;

(Note: the C++ version using the string class is preferred because you don't need to guess the size of the string before reading it in, which is safer from program crashes and buffer overruns).

However, the problem with this method is that only one word of the string will be picked up. Remember how a space separates multiple variable entries? The computer will think that entering the string "My name is Jason" is actually entering 4 string variables. If all the data is not picked up by the cin/scanf statement, the remaining data will stay in the input buffer until it is read. To read an entire line until the user presses enter, use the following commands:

/* C Code */
char str1[500] = "";
char str2[500] = "";

scanf("%499[^\n]", str1);
scanf("\n%499[^\n]", str2);
/* C++ Code */
#include <iostream>
#include <string>

using namespace std;

int main() {
  string str1;
  getline( cin, str1 );
  return 0;
}

The second scanf call places an end-of-line character at the front of its format string. This is because when the first line is read, the end-of-line is not stored in the string, and this character stays in the input buffer waiting to be read. It has to be consumed before the next line can be read, otherwise the second %[^\n] would stop immediately at the leftover end-of-line and store nothing. It cannot be placed at the end of the first format string instead, because whitespace in a scanf format keeps reading until it sees a non-whitespace character, so the call would not return until the user had started typing the next line. As before, the C++ version is preferred as the string class and getline take care of these issues for you.

The brackets in the scanf format list the characters to include, or in this case, to exclude, because the list begins with a caret, ^. The end-of-line character should not end up in the string, so it is excluded from input. The 499 limits the input to 499 characters, leaving room for the terminating null character in the 500-character array.


This section is C++ specific; the C equivalent will be discussed in the pointers section.

Sometimes functions need to change the variables passed to them, or return more than one variable. Normally, when parameters are passed to a function, they are copied, and when the function returns the caller's variables are unchanged. In order to change the variables, they need to be passed by reference, using the & in front of the variable name in the function header:

void Square(float &x) {
  x *= x;
}

Keep in mind that of course, this function will not work for constants, for example Square(5) will not work since 5, the constant, cannot be changed. Calling a function with reference parameters requires no changes.


Structures, called records in other languages, are simply large user-created variable types. As in other languages, fields need to be defined, and they are accessed through the dot operator. Below is a simple program that uses structures:

#include <iostream>
#include <string>

using namespace std;

struct Person {
  string name;
  int age;
};

int main() {
  Person Bob;

  cout << "Enter first name: ";
  getline( cin, Bob.name );
  cout << "Enter " << Bob.name << "'s age: ";
  cin >> Bob.age;
  cout << Bob.name << " is " << Bob.age << " years old." << endl;

  return 0;
}

Note: this declaration syntax will work in C++, but C programs would write the struct in the format below and would use the C-style input methods.

typedef struct {
  char Name[500];
  int age;
} Person;

Both these formats mean the same thing. Notice the semi-colon terminating the declaration. Keep in mind that all structures in C/C++ can be nested. The next program segment shows how complex this can get:

struct Arm {
  int fingers;
  float length;
};

struct Leg {
  int toes;
  float length;
};

struct Body {
  Arm arms[2]; //An array of structs in a struct
  Leg legs[2];
  float height;
};

struct Person {
  string name;
  int age;
  Body body; //Remember C++ is case sensitive so
             //"body" and "Body" are different
};

int main() {
  Person Townspeople[1500];
  Townspeople[50].name = "Bob";
  Townspeople[50].body.arms[1].fingers = 5;
  //How much nesting is too much nesting!?
  return 0;
}

This program shows how complex structs can get. For those programming C++, remember the structure of this example, since it is a good example of "Object Oriented Programming" or OOP. This is the most widely used approach since it is probably the easiest to understand: each object in "reality" is modeled using a struct or, in C++, a class. OOP in C is accomplished by using structures and writing functions that work on them, like the one below (shown here with a C++ reference parameter; a C version would take a pointer instead):

void CutOffFinger(Person &victim, int arm) {
  victim.body.arms[arm].fingers--;
}

This approach will of course work in C++, but C++ classes offer a more advanced and intuitive way to do this, calling the function through the dot operator, like Person.CutOffFinger(int arm). Classes will be discussed in the C++ specific tutorial. Also, notice that victim is passed by reference since the struct changes.

There is also a shortcut method for initializing structs that looks exactly like array initialization. Simply initialize each element in the struct, in order, in braces like this:

struct Person {
  string name;
  int age;
};

Person Bill = {"Bill", 56};

Constants can be used in place of variables when their value will never change. This can allow for faster execution and less memory use, and keeps flexibility for the programmer in case the program needs to be changed. To declare a constant, simply place the word const in front of a variable declaration.

const int MaxPlayers = 5; //Game constant
const float PI = 3.14159; //Math constant

struct Coordinate {
  int x, y;
};

const Coordinate Start = {50, 60};

Good candidates for constants are shown above, and include math constants, maximums and minimums, screen size and color depth -- anything that the user can't change and the program won't modify.


Before moving on to the pointers section, an understanding of the numbering systems used in computers will be needed, and a quick review is in order. If you already understand binary, decimal, and hexadecimal, then you may skip this section.

Humans are used to counting with base 10 numbers, digits 0-9. To find the value of a number in decimal or any other base numbering system, use this formula (the ^ means exponent, and places are counted from the right starting at 1):

digit * (base ^ (place-1)) + . . . for each digit

For example to find the value of 54321, the formula would be like this:

5*(10^4) + 4*(10^3) + 3*(10^2) + 2*(10^1) + 1*(10^0) = 54321

The binary system uses 2 digits, 0-1. The formula works the same way; for example, the binary number 100 converted to decimal is:

1*(2^2) + 0*(2^1) + 0*(2^0) = 4

The quickest way to read binary numbers is to add each place's value to the total if the place is "on", or a 1. A lot of times programmers separate every 4 digits with a colon for readability. The place values start at 1 and go up in powers of 2: 1,2,4,8,16,32,64,128,256, like this:

Value: 128 64  32  16  8   4   2   1
Number:  1  0   0   1: 1   1   0   0
Tally: 128 +0  +0 +16 +8  +4  +0  +0 = 156

The total number of combinations a binary number or variable can hold is 2^numbits, so the largest unsigned value (its capacity) is 2^numbits-1, since one combination is used for zero. For example a byte can hold 256 combinations, with values from 0 to 255, or 2^8-1. Below is a chart of the variable ranges for most current compilers. Please note that the C/C++ standards do not define specific sizes for these variables, except that the size of a char is always 1 byte:

Variable Names   Number of bits   Unsigned Capacity   Signed Capacity
byte, char       8                255                 -128..127
short            16               65535               -32768..32767
int              32               4294967295          -2147483648..2147483647
long long        64               2^64-1              -2^63..2^63-1

The hexadecimal system was chosen for use in the computer field because of its relation to binary. Every 2 digits in the hexadecimal system correspond to one byte, and each digit to a set of 4 bits. Digits go from 0..F, where A=10 and F=15. The formula shown above with the powers still holds for converting to the decimal system. Converting to binary is more a matter of memorization. The 0x in front of a number simply designates that it is in the hexadecimal system.

0xFF = 255 = 1111:1111
0xF = 15 = 1111
0x80 = 128 = 1000:0000
0x8 = 8 = 1000
0x0 = 0 = 0000

As you can see, you can memorize (or easily figure out) the 16 combinations and place the parts together. A quicker way in Windows is to use the calculator, which can perform these conversions in scientific mode.


Each variable is stored at a location in memory, and usually the variables are contiguous when declared together in a function. Manipulating very large objects, arrays, dynamically allocated variables (section below), and linked lists requires a more detailed understanding of how variables actually work in the compiler. In C, pointers are also the only way for a function to change variables that belong to its caller.

A pointer variable is denoted by an asterisk following a variable type, and contains a memory location pointing to a variable of that type. This can be written two different ways, with the * right after the type, or with the * right before the name. To get a pointer to a variable, place an & in front of its name. To reverse this operation (dereference the pointer), place an * in front of the pointer.

#include <iostream>
using namespace std;

int main() {
  int x       = 53;
  int *intptr = &x;
  int* ptr2   = intptr; //This syntax works too
	
  cout << "The value of x is: " << x << " or " << *intptr << endl;
  cout << "The memory address of x is: " << intptr << " or " << &x << endl;
  return 0;
}

The first cout will output the same value twice. Keep in mind that x and *intptr are the SAME data; they are NOT copies. The point of pointer variables is to have two names for the same data, or to mark the beginning of an array. To change the data a pointer points to, simply dereference it using the *, and keep in mind that x will change as well:

*intptr = 57; /*X becomes 57 as well since intptr points to data in x*/

An entire array cannot be stored under one variable, so when an array is declared, a block of contiguous memory is allocated, and the [] operator works on a pointer to its first element. The formula for the address of element x in an array, which is used internally by the compiler, is: ArrayAddress + x*sizeof(ArrayType), where ArrayAddress is a pointer to ArrayType. sizeof is a built-in operator, evaluated by the compiler, which returns the number of bytes a variable or type occupies. On most current compilers sizeof(int) equals 4, which the examples here assume. While programming, however, addition on pointers already takes the element size into account -- adding 1 to an int pointer actually moves it 4 bytes ahead, from 0x0 to 0x4.

int IntArray[500];            //Assume IntArray == 0x0
cout << &IntArray[0] << endl; //Outputs 0x0, the first element's address
cout << &IntArray[1] << endl; //Outputs 0x4, since the second var is 4
                              // more bytes into memory.
cout << IntArray << endl;     //Outputs 0x0, the first element's address
cout << IntArray + 200 << endl; //Outputs address of element 200, 0x320
cout << &IntArray[200] << endl;//Same as above

Now the reason why array elements start at 0 becomes clear -- adding 0 to the array address gives the first element. Also a very important note: an array will never truly start at 0x0, which is the address of the NULL pointer. This area of memory points to the start of conventional memory (DOS) or to the start of your program or segment, depending on processor and operating system. In any case, NULL, address 0, is not a valid accessible address in almost all operating systems, and luckily, in Windows accessing it will generate a GPF (general protection fault) and terminate the program with that lovely "Illegal Operation" box. I know what you are thinking: no -- it's not really helpful to programmers either, as we just start a debugger if we want information on the crash.

You can access an array entirely through pointers like so:

int array[500] = {0}; //initialize all elements to 0
int *traverse = array;//To traverse the array
for (int c=0; c<500; c++) //for all elements
  *(traverse++) = c;  //Set array elements. The ++ operator AFTER the pointer
                      // increments it AFTER this statement, moving to next
                      // element.  If the ++ were before the variable, then
                      // it would increment before the access, and would skip
                      // first element, and write one element over its bounds.

Accessing uninitialized pointers in DOS, or in any environment that does not check for NULL dereferencing, may not immediately crash the program. As DOS programming has been phased out, this error is now caught earlier, but it still results in a program crash.

When your program locks up or data is being changed for no reason, look at the code in the function first. If there is no other explanation for the seemingly random crashes, try looking at the pointers in your program to make sure all are pointing correctly and none point to NULL or to some other undesired or uninitialized address. Also check that arrays are not overstepping their bounds. Narrow the search down to functions that executed before the crash. Using a debugging tool that is part of your development environment is absolutely essential to finding these errors quickly. Debuggers are not addressed in this tutorial (maybe another time).

Because of the difficulty of debugging pointers, ALWAYS initialize a pointer, preferably to the data you need it to point to, or at least to NULL. Keep in mind you cannot initialize a pointer directly with a constant value, since constants are not supposed to be altered, with the exception being strings:

int *ptr = 0; //Same as int *ptr = NULL;, sets an "empty" pointer.
const char *name = "Bob"; //This is okay since "" returns a const char*

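Just keep in mind that name points to storage for exactly 3 letters plus the null character. If you copy a longer string into that memory, it will overstep its bounds (and a string constant should not be written to at all), so declare an array with more space when the string needs to change: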

char name[50] = "Bob";

Below are some common debugging problems with pointers.

Problem occurring in program: Data being changed for no reason
Possible statements which cause problem:
  int x[500]; x[900] = 50;
  char name[5];
  strcpy(name, "String Too Large");
Explanation: When an array goes out of bounds, it will work "correctly", except it will continue copying into other variables' memory.

Problem occurring in program: "Random" crashes
Possible statements which cause problem:
  int *ptr = NULL; *ptr = 50;
Explanation: Tries to change data at the NULL (0) address, which is not allowed.

In C, to change the value of variables passed to a function, you need to send pointers. This program shows how to achieve the same thing as C++ reference passing. As a side note, keep in mind that an array passed to a function is passed as a pointer to its first element, so changing the array inside the function also changes it outside the function.

/* C Code */
#include <stdio.h>

void Square(int *numb) {
  *numb *= *numb; //data *= data, remember to dereference to get at data
}

void SquareArray(int *arr) {
  for (int c=0; c<10; c++)
    arr[c] *= arr[c]; //can use [] on pointers since arrays are pointers too
}

int main() {
  int x = 5;
  int intarr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; //give the array values to square
  Square(&x); //must use & to pass a pointer, unlike C++ reference
  printf("x = %i", x);
  SquareArray(intarr); //intarr is already a pointer
  return 0;
}

Whenever you declare a variable in a C/C++ program, the compiler will reserve memory for it. For a global, this is usually done before program execution, in the .EXE file; for a variable inside a function, memory is allocated at run-time on the stack. You can see this directly: place a huge, several megabyte array in your source as a global array. Your .EXE may increase by the array's size. Place it in a function and it will not.

However, there is a very serious problem with placing that array in a function. When data isn't already allocated globally, it must go onto the stack. The stack size varies greatly depending on compiler and OS, and can range from 16k to a megabyte or more. No reasonably sized stack could be expected to hold a variable the size of a megabyte or even half that (consider that all previous function calls and the variables declared within them are also on the stack). You can tell you have run out of stack space when your program crashes miserably on you -- this is what happens if you don't allow an "exit" for recursion (when a function calls itself, much like a loop works) and you keep on calling functions until you run out of stack space.

Also as we have seen before, if we declare a variable on the stack, it is destroyed at the end of that function -- this is another disadvantage to placing variables on the stack.

The answer to this problem is called dynamic memory allocation. Whenever you specifically "ask" for memory at run-time, it is found not in the stack or the program, but in an area called the heap. When a program needs more memory, it must "get permission" from the operating system or the DPMI. A pointer to the start of the new memory will be returned.

In C++ the commands to manage memory are called new and delete. C programmers should note that there is no C++ equivalent of the C realloc() memory reallocation command. Below is an example of declaring a new variable on the heap:

int main() {
  int* x = new int;//Grabs a new space
  *x = 500;        //Sets the newly allocated data to 500
  delete x;        //Destroys x variable permanently
  return 0;
}

Note that if we never deleted x, the memory would remain allocated in the program until we exited back to the operating system, so ALWAYS make sure you delete variables after use; this is the main disadvantage of dynamic memory. Initializing a variable grabbed by new also becomes much more important, as the variable will nearly always contain trash and very rarely zero, whereas reserved global variables are usually cleared out. Another interesting fact is that if we allocated a new integer right after deleting x, the value of that integer could be 500, depending on how the compiler handles dynamic memory. I say this because the data from the 500 was never really "destroyed" upon deletion, but rather simply "dropped" and ignored by the program, marked as ready to be allocated again. This is exactly where your trash comes from in new variables.

Using the new command for a single variable is notably fairly useless, as a single variable will fit on the stack. For a large array, however, use the new[] operator. The syntax is slightly different for the two commands:

int main() {
  int* x = new int[5000]; //Declares a large new int array - new returns a pointer
  x[50] = 10; x[10] = 50; //Do stuff with new array -- acts no different
  delete[] x;             //[] denotes x points to an array rather than single var
  return 0;
}

Since an array is really a pointer, you can use an array declared with new exactly like any other array.

With new you can only grab a one-dimensional array whose size is chosen at run-time, not a two-dimensional one. This is a drawback, but there is a way of getting around it by creating an "array of arrays":

int main() {
  int** x = new int*[100]; //Declare pointer to a pointer, or in this case
                           // a pointer to an array of 100 pointers
  for (int c=0; c<100; c++) {
    x[c] = new int[100];   //Creates an array for each pointer in x array
  }

  x[1][1] = 5;    //Works like a 2D array

  for (int c=0; c<100; c++) {
    delete[] x[c];
  }
  delete[] x;

  return 0;
}

Remember a computer takes the code literally with zero interpretation. x[c] returns a pointer, and a pointer can be assigned an array with the new command. When it's time to use the array, you could look at it like this: ((x)[1])[1]. x is a pointer to a pointer, so the first [] operator returns a pointer, which really points to an array, and the second [] operator can then work on it.

An alternative is to simulate a 2-D array with a one dimensional one:

const int NUM_ROWS = 5;
const int NUM_COLS = 10;


int* x = new int[ NUM_ROWS * NUM_COLS ];

//If we want to access row 3, column 4:
x[ 3 * NUM_COLS + 4 ] = 10; //each row holds NUM_COLS elements
delete[] x;

This method is the one used by most compilers when you declare a two-dimensional array. Doing it by hand like this shows exactly how some of the things in C++ work, along with some of the minute technicalities.

C programmers do not have access to the new and delete operators. Instead they must use the malloc() and free() functions. As this tutorial is C++ biased, the explanation will be short. The malloc function allocates a given number of bytes of memory, and the free function releases memory returned by malloc. There is no distinction between single variables and arrays of variables as in C++, and you must calculate the byte size yourself:

/* C Code */
#include <stdlib.h>

int main() {
  int* x = 0;
  int* myArray = 0;

  x = malloc( sizeof(int) );            //same as "new int;"
  myArray = malloc( sizeof(int) * 20 ); //allocates an array of 20 int

  *x = 10;
  myArray[10] = 10;

  free( x );
  free( myArray );

  return 0;
}