RSS

Tag Archives: struct

Linked Lists and Pointers – struct node**

As I was reading up on linked lists is C and C++, I challenged myself to write an implementation for the data structure itself (i.e. the nodes) and the list operations (push, pop, etc). On my first go, I wrote the following:

struct node
{
    int data;
    struct node* next;
};

void push (struct node* head, int data)
{
    struct node* newNode = malloc(sizeof(struct node));

    newNode->data = data;
    newNode->next = head;

    head = newNode;
}

The node implementation is correct, of course, but the implementation for push is wrong. Any guesses?

Here is the correct implementation for push:

void push (struct node** head, int data)
{
    struct node* newNode = malloc(sizeof(struct node));

    newNode->data = data;
    newNode->next = *head;

    *head = newNode;
}

My guess would be that your first question upon reading this code is: “What is this struct node** business?” Well, that was my question, and it took me a few articles to figure it out. I first came across the problem in a programming interview book, Programming Interviews Exposed, but it took me until reading Nick Parlante’s article on Linked Lists in the Stanford CS Library to really understand it. I’ll save you the trouble of reading the whole document (although it is worth a read), but the gist of it is this.  When you pass a parameter to a function, whether it’s a primitive, composite (array), struct, or pointer, the function allocates its own stack memory for each of the parameters upon the function call, just as it does for any other local variable.  What this means for our purposes is that the first (incorrect) implementation of push only changes the value of the local variable head and not the value of the pointer that you originally passed, because in this case, head is a pointer to the struct data, not a pointer to the pointer-to-struct.  If you instead pass struct node** head, head then is a pointer to the pointer-to-struct.  By dereferencing head, you then can change the memory address stored at that memory location.

Moral of the story?  Pass a pointer to the data typeof the data that you want to change.  So if that’s an int, pass an int*, if that’s a pointer-to-int, pass an int**.  Seems easy enough…hope I follow my own advice!

 
Leave a comment

Posted by on January 2, 2012 in General, Programming

 

Tags: , , , , , , , , , ,

Pointers: A Five Minute Primer

Whenever anyone thinks about C, the first thing that comes to mind is pointers. In order to harness the full capabilities of what is a syntactically minimal programming language, you have to understand how C handles memory management, and how you can actually manipulate computer memory using C constructs. Here are the essentials:

What is a pointer?

The most basic way to deal with data in a function is to declare a variable and then initialize it with a value. For example, in the following case:

int x = 5;

We can think of the data value 5 being stored directly in a slot in memory named “x.” In the case of a pointer, however, the memory slot does not directly store the data. Instead it stores a memory address at which we can find the data. So if we look in slot “x”, we will find something to the effect of 0x714ae023, a memory address (in hexadecimal).

How to declare a pointer

Building on our previous example, let’s say you want a pointer to an integer (pointer-to-int). You would declare it as follows:

int* myIntPointer;

Thus the variable x is not of type int, but of type int*, a pointer-to-int (note you can also write int *x, but the former is better practice because it is less confusing; more on that in a second).

Pointer dereferencing

After you have declared a pointer variable, you might be tempted to say something like this:

myIntPointer = 5;

While the above is valid C code and will compile, it does not have the desired effect. What you have to remember is that myIntPointer stores a memory address. So rather than storing the data value 5, the computer (and your program) will think you are storing the memory address 5, and will look at that address for the data value.

So what do you do instead? In order to access and set the data stored at the memory address pointed to by your pointer, you use *, the pointer dereference operator. So in order to store the data value 5 at the address pointed to by your pointer, you would do this:

*myIntPointer = 5;

But wait, that’s still not correct…yet. Although you have declared the pointer variable (and thus the memory in which to store the address itself), you have not yet allocated memory in which to store the data value. When you first initialize. You need to use malloc, which is the C function that allocates a chunk of memory of the desired size and returns the address of that memory. It looks like this:

int* myIntPointer;
myIntPointer = malloc(sizeof(int));
*myIntPointer = 5;

One more thing to know. Any memory you have malloc-ed will not be reuseable by your program unless you explicitly tell your program to get rid of the data (see the discussion of stack vs. heap memory). So once you are done using a pointer and the relevant memory, you need to call free:

//code that uses myIntPointer

free(myIntPointer);

And that’s it! Then your program will be able to utilize the memory for another purpose, perhaps when you malloc memory for other data.

Aside: Pointers and structs

Let’s say you have the following struct declaration and intializations:

struct widget
{
    char name[20];
    double price;
};

int main()
{
    //random source code

    struct widget baseball = {"baseball", 9.99};
    struct widget* football = {"football", 19.90};
    ...
}

Normally, if you have a struct variable, as in the case of the ‘baseball,’ you use the dot (.) operator to access/change the name and price properties of the widget as follows:

{
    ...
    printf("%s", baseball.name);
    baseball.price = 5.00;
    ...
}

However, when you have a pointer-to-struct, you use the pointer dereference operator (->), as follows:

{
    ...
    printf("%s", football->name);
    football->price = 10.00;
}

How does C manage memory, and where do pointers fit in?

In C, there are two types of memory, local (or stack) memory and heap memory. Whenever a function is first called, allocates memory for all its variables in an area of memory called the stack. It is called the stack because every time a new function is called, your program allocates memory “on top” of the previous memory, and reads from that memory until the called function terminates. At that point, the program, having maintained a pointer to the calling function, will “pop” the memory for the called function off the top of the stack (i.e. deallocate it) and return to the previous function. That’s why if you declare a variable and store data to it within a function, the data is inaccessible to other functions once that function has terminated.

So how do we avoid this constraint? This is where heap memory comes in. Heap memory is a separate block of memory accessible to your program, but it is only used when you as a programmer explicitly decide to use it. In C, this would take the form of a malloc call, in C++, by use of the keyword new, and in Objective-C, with the message alloc sent to the classname. This provides you with memory that will persist beyond the lifetime of the function in which it is allocated; memory in the heap is not subject to the same lifetime constraints of stack memory. Thus, by maintaining a pointer to the malloc-ed memory, you can access and manipulate the stored data across multiple functions.

Think you understand pointers? Then here’s your first quiz: Knowing what you now know about memory management in C, what is the conceptual difference between:

struct widget myWidget;
myWidget.name = "baseball";
myWidget.price = 10.0;

and:

struct widget* myWidget = malloc(sizeof(struct widget));
myWidget->name = "baseball";
myWidget->price = 10.0

The answer, if you understand the stack and heap memory, is actually quite simple. The memory for the first struct is allocated upon function call from the stack, so the data persists only the length of the function. As long as the function has not finished executing (e.g. you are currently executing a called function), you can access and manipulate the data all you want. However, as soon as the function terminates, so too does the data.

The second version, in which the function maintains a pointer to the data, is different because the data is stored in memory allocated from the heap. This data will persist even after the function terminates, allowing it to be accessible (as long as you have a pointer to it) until you deallocate the memory with free.

If you’re looking for more help, check out Nick Parlante’s guide to pointers in the Stanford CS Library. And of course, if you have any questions for me, leave a comment! I’d be glad to help.

 
Leave a comment

Posted by on December 26, 2011 in General, Programming

 

Tags: , , , , , , , , ,