C - Low-level, structured, procedural, and statically typed programming language.
Developed at Bell Labs in the early 1970s by Dennis Ritchie (Ken Thompson created its predecessor, B).
Inspirations: BCPL -> B -> C
Standards: K&R C (unofficial), ANSI C (aka C89/C90), C99, C11, C17 (aka C18), C23 (formerly C2x).
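Undefined - The standard imposes no requirements, so anything can happen; dangerous and causes surprises. Ex - Integer division by 0, out-of-bounds access, dereferencing a null pointer.
Unspecified - The standard allows several behaviours and the implementation need not document which one it picks. Ex - Order of evaluation of function arguments.
Implementation defined - The implementation chooses the behaviour and must document it. Ex - Size of data types.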
#include<stdio.h>
int main(int argc, char* argv[]){ // same as char** argv
printf("Hello World!");
return 0;
}
main() // empty list: parameters are unspecified, so calls with any number of arguments are not checked (until C23)
main(void) // takes no parameters; passing arguments in a call is a compile-time error
Source (.c) -> Preprocessor (.i) -> Compiler (.s) -> Assembler (.o/.obj) -> Linker (.exe) -> Loader
Static libraries are combined with the object files at the Linker stage.
SIGSEGV - Segmentation violation (access to memory the process is not allowed to access)
SIGBUS - Bus error (e.g. misaligned access or access to a nonexistent physical address)
SIGABRT - raised by an abort() call
SIGFPE - Arithmetic exception (despite the name, also raised for integer divide by zero)
SIGILL - Illegal instruction (CPU tried to execute an instruction it didn't understand)
SIGSYS - invalid argument passed to a system call
SIGTRAP - trace/breakpoint trap, used by debuggers
There are five types of tokens: keywords, identifiers, constants, operators, and separators.
There are a total of 44 keywords in C (C89 – 32, C99 – 5, C11 – 7).
Identifiers - Names of variables, functions, etc.
Integer - decimal (42), octal (prefix 0), hex (prefix 0x)
Real - float (suffix f or F), double (no suffix), long double (suffix l or L)
Character - 'a'
String - "abhi"
Escape Sequences - '\0', '\"', '\xFF'
Arithmetic - +, -, *, /, %
Relational - <, >, <=, >=, !=, ==
Logical - &&, ||, !
Bitwise - &, |, ~, ^, <<, >>
Assignment - =, compound assignments
Conditional - ?:
Unary - ++, --
Others - comma (,), dereferencing (*), addressof (&)
Separators - comma (,), {}, (), []
// this comment extends to next line \
printf("hello");
printf("world");
Also works with macros:
#define REPLACE(a, b) if(a < b) \
a = b;
Preprocessor directives start with # and can appear anywhere in the program.
#include<stdio.h> // searches the standard include directories
#include "myheader" // searches the current file's directory first, then the standard include directories
#define PI 3.1415 // macro as object
#define MUL(a, b) a * b // macro as function
#undef
#ifdef
#ifndef
#endif
#if
#endif
#elif
#else
#line n "filename" //filename is optional
#warning here is a warning message to show //shows a compile-time warning message
#error here is your error description //throws compile time error
#pragma once //not standard, but widely supported
Other compiler-specific pragmas: #pragma warn, #pragma startup, #pragma exit
Predefined macros, mostly used for debugging:
__LINE__
__TIME__
__DATE__
__FILE__
__func__ // C99 predefined identifier (not a macro): name of the enclosing function
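#define string_this(a) #a // # turns the argument into a string literal
printf("%s", string_this(Hello)); // prints Hello
#define combine_these(a, b) a##b // ## pastes two tokens into one
const char* name = "Logan";
printf("%s", combine_these(na, me)); // expands to name, prints Logan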
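Macros are plain text substitution; their arguments are not type-checked.
1. Include guards: wrap a header in #ifndef/#define/#endif so it is processed only once.
2. No nesting: a macro is not expanded again inside its own expansion.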
#define foo (foo + 10)
foo
// foo will be replaced by (foo + 10), not further
3. No macro definition inside a macro definition
#define DEF #define A 999
DEF // won't work
4. Redefining macros
#define A 99
printf("%d", A); // prints 99
#undef A // needed before redefining with a different value
#define A 88
printf("%d", A); // prints 88
5. Undefined identifiers in #if: any identifier that is not a defined macro evaluates to 0 in an #if expression.
#if X == 3 // X is not defined, so this compares 0 == 3
printf("Yes");
#else
printf("No");
#endif
6. Macros don’t get pasted in other macro definitions
#define A define
#A B 99 // compile-time error
7. Every #if, #ifdef, and #ifndef must be closed with #endif; leaving one open is a compile-time error.
8. An innovative way to block comment:
#if 0
printf("%s", "A");
printf("%s", "B");
printf("%s", "C");
#endif
9. Macro arguments are not evaluated before macro expansion
#include <stdio.h>
#define MULTIPLY(a, b) a*b
int main()
{
// The macro is expanded as 2 + 3 * 3 + 5, not as 5*8
printf("%d", MULTIPLY(2+3, 3+5));
return 0;
}
// Output: 16
10. An Interesting Case
#include<stdio.h>
#define A -B
#define B -C
#define C 5
int main()
{
printf("The value of A is %d\n", A);
return 0;
}
Preprocessed File (the two minus signs stay separate tokens, so the argument is -(-5) and the program prints 5) -
int main()
{
printf("The value of A is %d\n", - -5);
return 0;
}
Variadic macros: we want eprintf("%s:%d: ", input_file, lineno)
to expand to fprintf(stderr, "%s:%d: ", input_file, lineno)
#define eprintf(format, ...) fprintf (stderr, format, __VA_ARGS__)
argc is normally at least 1, since argv[0] conventionally holds the program's own filename.
// Input: a.exe my code
#include<stdio.h>
int main(int argc, char* argv[])
{
if (argc < 3) return 1; // argv[1] and argv[2] exist only if two arguments were passed
printf("%s ", argv[0]);
printf("%s ", argv[1]);
printf("%s ", argv[2]);
return 0;
}
// Output: a.exe my code
Declaration: introduces a name and its type to the enclosing scope.
Definition: a declaration that also allocates storage.
int a = 9; // both
int a; // both; indeterminate (garbage) value inside a block, zero at file scope (tentative definition)
extern int a ; // declaration only
In C, variables are always statically (or lexically) scoped i.e., binding of a variable can be determined by program text and is independent of the run-time function call stack.
For example, output for the below program is 0, i.e., the value returned by f() is not dependent on who is calling it. f() always returns the value of global variable x.
# include <stdio.h>
int x = 0;
int f(){
return x;
}
int g(){
int x = 1;
return f();
}
int main(){
printf("%d", g());
printf("\n");
getchar();
}
// Structural nature of C
#include<stdio.h>
int main()
{
int a = 9;
{
int b = 8;
}
printf("%d", a);
printf("%d", b);
}
// Compile-time error: 'b' undeclared
There are basically four scope rules in C:
Block: if-else, loops, etc…
if(1){
// block scope
}
File: Global variables
// file scope
int a = 5;
int main(){
}
Function Prototype: only until the end of a function prototype/declaration
int sum(int a, int b);
// or
int sum(int, int);
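Function: applies only to labels (goto targets), which are visible throughout the function body; everything else declared in the body has block scope
int sum(int a, int b)
{
goto done; // 'done' is visible anywhere in this function
done:
return a + b;
}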
Five storage class specifiers in C are: auto, extern, static, register, typedef (typedef counts as one only syntactically). A declaration can’t have more than one storage class.
auto: the default for any variable declared inside a block.
extern: only useful when multiple source files are present and need to share a variable defined once. It signifies that the variable is defined elsewhere and not at the point where it is being declared.
//DEMO 1: Global variable within the same file
#include<stdio.h>
int main()
{
extern int a; //without this, 'a' is undeclared here because its definition comes after main
printf("%d", a);
}
int a = 99;
//Output: 99
//DEMO 2: Global variable in another file
//file1.c
#include<stdio.h>
int main()
{
extern int a; //type must match file2.c (a mismatch is undefined behaviour); without this line 'a' is undeclared
printf("%d", a);
}
//file2.c
int a = 88;
// compile as => gcc file1.c file2.c
// Output: 88
//DEMO 3: Extern works only on globally scoped variables
#include<stdio.h>
int main()
{
int a = 9;
{
extern int a;
printf("%d", a);
}
}
//Link-time error: undefined reference to 'a' (the block-scope extern refers to a global 'a', which is never defined)
extern int a;
extern int a;
extern int a; //can do this multiple times as it's a declaration and not a definition
extern int a = 9; //at file scope this becomes a definition (GCC warns); at block scope an initializer on extern is a compile-time error
static: serves two purposes:
1. Inside a block, static variables don’t get destroyed when the flow goes out of scope (static storage duration).
2. At file scope, static variables and functions can’t be accessed outside the file in which they’re defined (internal linkage).
Static variables live in the data/BSS area of memory, hence their default value is always 0.
static variables can only be initialized with constant expressions, whose value is known at compile-time.
Static members cannot be declared inside a structure. All members of a structure must be laid out contiguously, because a member is accessed by its offset from the structure’s starting address. A structure object may live on the stack (declared inside a function), on the heap (allocated dynamically), or in the data/BSS segment (global), but all of its members must sit in that same segment. Placing one member alone in the data segment would break this layout. An entire structure object can still be declared static.
register: same as auto, but hints that the variable should be kept in a fast CPU register; its address can’t be taken with &. Avoid register arrays: such an array can’t decay into a pointer, so it can’t be indexed or passed to functions.
typedef: assign alternative names to existing types:
typedef long long ll;
typedef unsigned long long ull;
typedef struct my_structure* s;
typedef static int points; //compile-time error, multiple storage classes are not allowed
The only storage-class specifier that shall occur in a parameter declaration is register.
const: the value given at initialization can’t be changed afterwards.
const a; //implicit int: accepted only by pre-C99 compilers, write const int a instead
int const a = 9; //same as storage class, position doesn't matter
const int a = 9;
volatile: tells the compiler not to optimize away accesses to the variable, because its value may change from outside the program (hardware registers, signal handlers) in ways the compiler may not be aware of.
volatile int a = 9;
int volatile a = 9;
const volatile int a = 9; //allowed and useful, e.g. a read-only hardware register that the program must not write but that can change on its own
Linkage describes where the linker finds variables and functions to link.
No Linkage: Block scope (register, auto, static within block)
Internal: File scope (static)
External: Multiple files (extern)
While talking about linkage we’re not talking about file inclusions using #include but actual linking of two sources by $gcc file1.c file2.c.
Global variables are initialized to 0 or NULL by default.
A static variable can only be accessed in the same file (internal linkage).
//file1.c
#include<stdio.h>
int main()
{
extern int a;
printf("%d", a);
}
//file2.c
static int a = 88;
//Output: link-time error, undefined reference to 'a'
#include<stdio.h>
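int a; // tentative definition
int a; // another tentative definition, allowed
int a = 9; // the actual definition
static int b; // tentative definitions also work with internal linkage
static int b = 9;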
int main()
{
printf("%d", a);
}
//Output: 9
When multiple files are compiled and linked, there might be clashes in naming of global variables and functions. In that case, linker needs to decide what to keep and what not to.
Initialized global variables and functions: Strong Symbols
Uninitialized global variables: Weak Symbols
Rule#1: Two strong symbols with the same name are not allowed.
Rule#2: If a strong symbol and a weak symbol share a name, the strong one is chosen.
Rule#3: If multiple weak symbols share a name, any one of them may be chosen (dangerous, difficult to debug).
Note: GCC 10 and later default to -fno-common, which turns clashes between uninitialized globals into multiple-definition link errors.
Link: https://www.geeksforgeeks.org/complicated-declarations-in-c/