Arrays, Strings, Enum, Struct, and Union

Arrays

A collection of homogeneous elements stored at contiguous memory locations.

int arr[3];			// garbage values

int arr[] = {};		// sizeof(arr) is 0

int arr[] = {1, 2, 3};	// {1, 2, 3}

int arr[3] = {};	// {0, 0, 0}
int arr[3] = {0};	// {0, 0, 0}

int arr[3] = {1}	// {1, 0, 0}

int arr[3] = {1, 2, 3, 4, 5};	// excess elements warning in C

// Designated Initializers (>=C99)
int arr[] = {[0 ... 2] = 1}; 	// {1, 1, 1}
int arr[] = {[2 ... 4] = 1}; 	// {0, 0, 1, 1, 1}

int arr[5] = {[0 ... 2] = 0, [3 ... 4] = 1};	// {0, 0, 0, 1, 1}
int arr[5] = {[2 ... 4] 1};					// no equals, rest init to zero: {0, 0, 1, 1 ,1}
int arr[5] = {[0 ... 4] 0, [2 ... 4] 1};		// overlapped: {0, 0, 1, 1 , 1}

int arr[5] = {[4] = 9};		// single element init, rest to zero: {0, 0, 0, 0, 9}
int arr[5] = {[4] 9};		// no equals: {0, 0, 0, 0, 9}
  • Only void and function arrays are not possible.
  • We can allocate memory to stack, heap, data (all three)
local array: stack

static or global array: data

dynamically allocated array using malloc(): heap
  • no index bound checking
int arr[5] = {0};
printf("%d", arr[99]);		// valid, but undefined
printf("%d", arr[-99]);		// valid, but undefined

Arrays and Pointers

  • arrays decay into pointers internally
arr[2] decays as *(arr + 2)

arr[2][3] decays as *(*(arr + 2) + 3)
  • name of an array is a const pointer
int arr[5] = {0};
arr++;		// invalid
  • array name gives address of & the first element
arr == &arr[0]		// true
*arr == arr[0]		// true
  • adding constant offset to array
arr + 1 	// points to next element address

&arr + 1 	// points to address where entire array just finishes 

Static and Global Arrays

  • they are initialized with all elements as 0 (data section of memory)
  • they can’t be variable sized

Variable Sized Arrays

The size of array must be constant literal or constant expression such that it is known at compile time. They are allowed since C99 but certain rules apply:

  1. They can’t be initialized.
  2. They can’t be static, global, or extern.
int n = 5;
int arr[n];

Passing Array to Functions

  • Leftmost parameter is always optional for n-D arrays where n = 1, 2, …
// declaration and initialization
int arr[][2] = {{1, 2}, {1, 3}, {1, 4}, {1, 5}};
 

// function definition
int ladder_search(int arr[][3], int n){ 	// or (int *arr, int n)
	
	// sizeof arr = size of pointer

}	

int main(){

	int arr[2][3];
	ladder_search(arr, n);		// sieof arr = 2*3*4 = 24
}
  • Always passed as a “shallow copy”, unless they’re inside a structure. structure is “deep copied” across functions.

  • static in array passing: (a whole new meaning of this keyword) used to specify the minimum number of elements a passed array must have.

int sum_array(int* arr[static 2], int n){ ... }

sum_array(arr, n);

Strings

"johnwick"  // a string literal

// TYPES

char str[] = "johnwick";	// char array storing string, stored as "johnwick\0"

char *str = "johnwick";		// immutable, string literal allocated space in data section, pointed to by str pointer
// DECLARATION

char str[] = "johnwick";	// sizeof = 9 (includes \0), see # below
char str[10] = "johnwick";	// "johnwick\0\0". sizeof = 10

// caution
char str[8] = "johnwick"; 	// sizeof = 8 (weird but true)
  • decompose to pointer
printf("%s", "johnwick" + 4);	// wick
printf("%s", "johnwick"[4]);	// wick

Arrays and Strings Pitfalls

  • don’t use sizeof on arrays in funtions since they’re passed as pointers when passed to a function
  • strings get concatenated in source code
printf("hello" 				"world");

char *str = "Lorem ipsum dolor sit amet, consectetur adipiscing elit."
" Nulla at dui odio. Praesent diam mi, scelerisque non nulla at, finibus efficitur lectus."
" Praesent risus nulla, dictum sed sapien quis, sollicitudin consectetur justo. Pellentesque habitant morbi"
" tristique senectus et netus et malesuada fames ac turpis egestas. Vestibulum pretium,"
" leo et ultricies gravida, est ante molestie elit, ut elementum tortor nisl vitae orci. "
"Suspendisse vel tempor enim.";

Structures

Collection of dissimilar data types under one name.

struct foo{
	int a;
	char c;
} f1;


// alternatively

struct foo{
	int a;
	char c;
};

int main(){
	struct foo f1;
}
// Initializing

struct foo{
	int a;
	char c;
} f1 = {1, 'a'};		//  global struct variable


// alternatively
struct foo f1 = {1, 'a'};


// below is invalid since struct is a blueprint only
struct foo{
	int a = 1;
	char c = 'a';
};

Designated Initialization: Use .var = or var :.

struct foo{
	int a;
	char c;
};

int main(){
	struct foo f1 = {.c = 'z', .a = 9};

	// struct foo f2 = {c : 'z', a : 9};
}

Accessing Members

  • Dot (.) Operator
struct foo{
	int a;
	char c;
} f1 = {1 ,'a'};

int main()
{
	printf("%d", f1.a);		// 1
	printf("%c", f1.c);		// a
}
  • -> Operator Shorthands *ptr.a to ptr -> a.
struct foo{
	int a;
	char c;
} f1 = {1 ,'a'};

int main()
{
	struct foo *ptr = &f1;
	printf("%d", ptr -> a);		// 1
	printf("%c", ptr -> c);		// a
}
  • Members can’t be static in a struture.

Operations on struct

s1 = s2		// valid

sizeof s1	// valid

// any other operation is invalid on struct or union variables including ==

Memory Alignment, Structure Padding, and Data Packing

Memory is usually byte (8-bits) addresssable and to optimize reading an int (32-bits) four memory banks can be used to fetch an int in one read cycle. If we have a short (16-bits), a char (8-bits), and an int (32-bits), 8-bits get wasted due to no alignment. The “wasted” bits are padded and are counted in total size of struct.

s s c _
i i i i

// int, short, int : wastage = 16-bits
i i i i
s s _ _
i i i i

Reduce padding by declaring strcture members in increasing/decreasing order of their size.

  • Every structure is also aligned with respect to another structures.

Reference

Flexible Array Members

Reference

Bitfields

Specify number of bits a member takes. To save space wastage in union and struct.

struct foo{
	unsigned int a : 2;		// can represent till 4, be careful with signed
	unsigned char c : 6;
};

Forcing alignment with 0

struct test1 {
    unsigned int x : 5;
    unsigned int y : 8;
};
 
// A structure with forced alignment
struct test2 {
    unsigned int x : 5;
    unsigned int : 0;		// this
    unsigned int y : 8;
};
 
int main(){

    printf("%d ", sizeof(struct test1));
    printf("%d", sizeof(struct test2));
}

// Output: 4 8

Reference

Nested Structures

  • Possible, union within struct is also possible.

Anonymous Struct and Union

#include<stdio.h>
struct Scope
{
    // Anonymous structure
    struct
    {
        char alpha;
        int num;
    };
};
 
int main()
{
    struct Scope x;
    x.num = 65;
    x.alpha = 'B';
 
    // Note that members of structure are accessed directly
    printf("%c %d", x.alpha, x.num);
 
    return 0;
}

// Output: B 65

Union

Same properties as struct but varaibles share a single memory allocation. Only first member is initialized.

union bar{
	int x;
	float y;
} u1;

// sizeof u1 = 8 bytes ( = sizeof largest member)

Enums

To declare integral constants.

enum days{sun, mon, tue, wed, thurs, fri, sat};

enum days x = mon;		//enum variable, can be int too
printf("%d", x);

// Output: 1
enum days{sun, mon, tue, wed, thurs, fri, sat} x;

// or declare later
enum days x;
  • automatic assignment of values from 0, if we don’t assign explicitly.
  • any value not assigned gets a value 1 + previous element's value
enum nums{first = 1, second, third = 2};
printf("%d ", first);
printf("%d ", second);
printf("%d", third);

// Output: 1 2 2 
  • values must INT_MIN to INT_MAX

  • constant identifiers can’t be the same (obvious), even in different enums

enum work {yes = 1, yes = 2};	// valid

enum one {foo, bar};
enum two {foo, dog};	// invalid as identifier "foo" appears twice in the same scope

Compound Literals

Allows us to create unnamed objects with given list of initialized values. (>=C11)

struct foo {
	int a, b;
};

int main() {
	struct foo s;
	sum((struct s) {2, 3});		// struct initialized and passed simultaneously
}


int *p = (int []){2, 4, 6};		// unnamed array created and address assigned to p
printf("%d %d %d", p[0], p[1], p[2]);	// 2 4 6