Transcript free list
Lecture 10: Structs and Lists
Records of value
Structured Data
• So far, all variables have stored one value
• A structured record is a variable consisting of fields that
can be of different types
• For example, a person record might have name, age, ...
• To define your own structured record type, use the keyword
struct and the name of your type, followed by the list of fields
inside a block of braces
• The block must be followed by semicolon, as you are
allowed to define a bunch of variables at the same time
• struct person { int age; char name[30]; } p1, p2, p3;
• To access a field, use the dot operator, e.g. p1.age
typedef
• After declaration, you can use your struct as you would use
any other type to declare variables and parameters
• You can even use your struct as a field of another struct
• However, the C syntax doesn't allow you to simply use the
name of your struct as type, but annoyingly, it must be
preceded with keyword struct every time you use it
• Often after struct declaration, we use typedef to make a
handy one-word alias for the struct
• For example, typedef struct person person;
• Since aliases created with typedef reside in a different
namespace than other names, it is legal to reuse the name
Operations on structs
• structs can be assigned to each other and passed to and
from functions either by value or (preferably) pointers
• However, structs can't be compared for equality with ==
• Since struct guarantees that its fields are stored in memory
in the exact order they are listed, there may be slack bytes
between fields to achieve proper alignment
• Equality testing cannot be simple memory comparison byte
by byte, since slack byte values shouldn't matter
• Also if some field is a string, it should be compared with
strcmp, the slack in the end again being irrelevant
• When defining a struct, define your own functions for
equality test, output, etc.
Example: Equality Test for person
int person_equals(person p1, person p2) {
return p1.age == p2.age &&
strcmp(p1.name, p2.name) == 0;
}
int main() {
person bob = { 40, "Bob" }; /* struct initialization */
person bob2 = { 40, "Bob" };
person tom = { 32, "Tom" };
printf("%d %d\n", person_equals(bob, tom),
person_equals(bob, bob2) ); /* 0 1 */
return 0;
}
Interlude: Unions
• For each struct, the compiler precomputes the offset for
each data field in memory from the beginning of struct,
based on the types of fields
• An union is defined exactly like a struct, but with union
• In an union, the field type itself is variant, and the union
instance stores one value at the time
• Simply implemented by defining each field offset to be 0
• Size of union instance equals the size of its largest field
• Rarely needed these days with ample memory
• If you can edit somebody's stdio.h or some other often
included file, a classic prank is #define struct union
Pointers to structs
• Just like with any type, you define a pointer to that type by
appending an asterisk to the end of the type
• If p is a struct person*, you need to dereference the pointer
to access the fields in the struct it points to, e.g. (*p).age
• p.age would be a type error, as pointers don't have fields
• As convenient syntactic sugar, use p -> age
• Often, a struct internally contains a pointer field to another
struct of the same type
• This is necessary in all data structures that consist of
dynamically allocated nodes, since we need pointers to
refer to nodes that don't themselves have names, which are
in a fixed supply in our code
Opaque Data Types
• When defining a data structure that consists of several
nodes that are dynamically added and removed, that data
structure should always be designed to be opaque
• Users of the type never access the data directly, but only
through functions coming with the data type, one function
per possible allowed operation on that data type, getting a
pointer to the data structure as parameter
• For example, a dynamic set might consist of operations
"create set", "add element", "remove element", "search for
element" and "release set"
• The users of the data type call these functions, but never, for
example, explicitly create and release nodes of that data
structure using malloc and free
Interlude: OOP
• An opaque data type is defined in a module whose header
file the users of that type #include, and whose
implementation is linked to the final executable
• This hides the implementation details, but inside the module,
functions still operate on data from the outside
• In object-oriented programming, all functions are written
inside the struct as methods, making that struct a class
• An instance of a class is called an object
• In a sense, objects are live and control their own behaviour,
instead of being "dead data" operated on by outside
functions
• Subtyping with inheritance to create subclasses
• Method calls resolved polymorphically at runtime
Linked Lists
• Classic simple implementation of a dynamic set, a stack, a
queue, or many other opaque data structures
• A linked list consists of a chain of nodes, each node storing
one element (a key) and a next pointer to the next node
• Unlike arrays, lists are not a random access structure
• Problem: how do we get to the first node, from which we can
then step to any other node?
• A separate header node represents the list as a whole, and
contains a head pointer to the first actual node
• If the list is currently empty, head == NULL
• For the last node in the chain, next == NULL (a simple linear
list) or points back to the beginning (a cyclic list)
Example: sln.h
#ifndef SLN_H
#define SLN_H
struct sln_node {
struct sln_node* next;
int key;
};
typedef struct sln_node sln_node;
struct sln_list {
struct sln_node* head;
};
typedef struct sln_list sln_list;
/* Function prototypes */
#endif
Recycling Unused Nodes
• Many data structures that consist of nodes that
are dynamically added and removed to it tend to be
constantly allocating and releasing such nodes
• For greater efficiency, instead of actually releasing an
unused node, recycle it in a separate free list
• Whenever you need a new node, instead of using malloc to
create a node, take one from the free list if there are any
• Nodes don't go sour in the memory, so the node that you
take from the free list is just as good as a fresh new one
• Can yield a noticeable speedup in some applications
Free List Operations
sln_node* sln_allocate_node() {
sln_node* n;
if(freelist == NULL) {
freelist = malloc(sizeof(sln_node));
freelist->next = NULL;
}
n = freelist; freelist = n->next; n->next = NULL;
return n;
}
void sln_release_node(sln_node* n) {
n->next = freelist; freelist = n;
}
Creating a New List
/* Create a new singly-linked list. We only need to allocate the
header node and initialize its head field to NULL. */
sln_list* sln_create() {
sln_list* list = malloc(sizeof(sln_list));
list->head = NULL;
return list;
}
Releasing the Entire List
/* Release the list and all its nodes. */
void sln_release(sln_list* list) {
sln_node* n = list->head;
sln_node* m;
while(n != NULL) {
m = n->next;
sln_release_node(n);
n = m;
}
free(list);
}
Opaque Data Type Lifetime
• When we define an opaque data type, we define functions to
create and release one instance of it
• The user code should never have to use malloc and free
explicitly, but this logic is encapsulated in our functions
• However, the user code is still responsible for not using an
instance that has been released
• In C, it is not possible to reliably determine if the instance
has been released, since once a memory block has been
released, anything can happen to its contents
• Those contents might get set to whatever values you
interpret to mean "this block is still in use"
• For the same reason, you can't look at memory and
determine if the instance has been properly created
Inserting a Key
/* Insert a new element to the list. In a linked list, inserting a
node to the beginning of the list is especially easy. */
void sln_insert(sln_list* list, int key) {
sln_node* n = sln_allocate_node();
n->key = key;
n->next = list->head;
list->head = n;
}
Searching for a Key
/* Check if the list contains the given key. */
int sln_contains(sln_list* list, int key) {
sln_node* n = list->head;
while(n != NULL && n->key != key) {
n = n->next;
}
return (n == NULL)? 0: 1;
}
Removing a Key (1)
int sln_remove(sln_list* list, int key) {
sln_node* n;
sln_node* m;
n = list->head;
if(n == NULL) { return 0; }
if(n->key == key) {
list->head = n->next;
sln_release_node(n);
return 1;
} /* Continued on next slide */
Removing a Key (2)
while(n->next != NULL && n->next->key != key) {
n = n->next;
}
if(n->next != NULL) {
m = n->next;
n->next = m->next;
sln_release_node(m);
return 1;
}
return 0;
}
FIFO Queue with Flexible Arrays
• A linked list can be used to implement a last-in-first-out
stack, or a first-in-first-out queue
• Push new element in one end, and pop the element in the
same end (stack) or the other end (queue)
• Alternatively, elements can be stored in an array
• To avoid having to resize the array all the time, we maintain
slack in the end of the array
• Field capacity tells the true size of the underlying array,
whereas size tells how much of that capacity is in use
• When pushing an element to an array that is full, we resize
by allocating a new array twice as big, so there is again
good slack for some time
arrayqueue.h
#ifndef ARRAYQUEUE_H
#define ARRAYQUEUE_H
struct array_queue;
typedef struct array_queue array_queue;
array_queue* aq_create();
void aq_release(array_queue*);
int aq_size(array_queue*);
int aq_pop_front(array_queue*);
int aq_push_back(array_queue*, int);
#endif
Storing a Queue in an Array
• A LIFO stack is easier to implement in a flexible array than a
FIFO queue, since both push and pop occur in the end
• Need to maintain one index size to store the current end
• In a FIFO queue, pop happens in the other size than push
• If we push by adding the element to the end, pop must take
the oldest element from the front
• Unlike in a stack, front does not remain at the beginning of
the array
• A ring buffer consists of an array and two indices head and
tail, so that tail chases the head
• Once either index goes past the end of the array, it comes
back to the beginning (hence the name)
The arrayqueue structure
const int INITIALCAPACITY = 2;
struct array_queue {
int* elements;
int size, capacity, head, tail;
};
typedef struct array_queue array_queue;
Creating an arrayqueue
array_queue* aq_create() {
array_queue* aq;
aq = malloc(sizeof(array_queue));
aq->capacity = INITIALCAPACITY;
aq->head = aq->tail = aq->size = 0;
aq->elements = (int*) malloc(sizeof(int) * INITIALCAPACITY);
return aq;
}
Releasing an arrayqueue
/* In writing aq_release, we have to be careful to release the
array aq->elements before releasing the struct aq itself.
Otherwise, if we release aq first, we can no longer access its
field elements to get to the element array. Note how releasing
memory blocks in a data structure typically happens in reverse
order to how they were allocated. */
void aq_release(array_queue* aq) {
free(aq->elements);
free(aq);
}
Expanding for Push
if(aq->size == aq->capacity) {
int* new_elements = (int*) malloc(sizeof(int) * 2 * aq->size);
for(i = 0; i < aq->size; i++) {
new_elements[i] = aq->elements[aq->tail];
aq->tail = (aq->tail + 1) % aq->capacity;
}
free(aq->elements);
aq->elements = new_elements;
aq->tail = 0; aq->head = aq->size;
aq->capacity = 2 * aq->capacity;
}
Pushing an Element
int aq_push_back(array_queue* aq, int key) {
int i;
expand_if_full(aq);
aq->elements[aq->head] = key;
aq->head = (aq->head + 1) % aq->capacity;
aq->size++;
}
Popping an Element
int aq_pop_front(array_queue* aq) {
int key = aq->elements[aq->tail];
if(aq->size == 0) { printf("Error!\n"); exit(1); }
aq->tail = (aq->tail + 1) % aq->capacity;
aq->size--;
return key;
}
"Object-oriented programming is an exceptionally bad idea
which could only have originated in California."
Edsger W. Dijkstra