Nico Ludwig (@ersatzteilchen) (5) Basics of the C++ Programming Language

(5) cpp dynamic memory_arrays_and_c-strings

Jul 08, 2015


Nico Ludwig

This presentation comes with many additional notes (pdf): http://de.slideshare.net/nicolayludwig/5-cpp-dynamic-memoryarraysandcstrings-38501725

Check out these exercises: http://de.slideshare.net/nicolayludwig/5-cpp-dynamic-memoryarraysandcstringsexercises

- The Heap: Dynamic Memory and dynamic Array Allocation
- Automatic versus Dynamic Arrays
- A Glimpse of the Topic "Stack versus Heap"
-- "Geometric" Properties of the Heap and the Stack
- Lost Pointers and Memory Leaks
- Advanced C-strings: Buffers, Concatenation and Formatting
Transcript
Page 1: (5) cpp dynamic memory_arrays_and_c-strings

Nico Ludwig (@ersatzteilchen)

(5) Basics of the C++ Programming Language

Page 2: (5) cpp dynamic memory_arrays_and_c-strings

TOC

● (5) Basics of the C++ Programming Language

– The Heap: Dynamic Memory and dynamic Array Allocation

– Automatic versus Dynamic Arrays

– A Glimpse of the Topic "Stack versus Heap"

● "Geometric" Properties of the Heap and the Stack

– Lost Pointers and Memory Leaks

– Advanced C-strings: Buffers, Concatenation and Formatting

● Sources:

– Bruce Eckel, Thinking in C++ Vol I

– Bjarne Stroustrup, The C++ Programming Language

Page 3: (5) cpp dynamic memory_arrays_and_c-strings

Automatic Arrays have a Compile Time fixed Size

● The size of automatic arrays can't be set at run time:

int count = 0;
std::cout<<"How many numbers do you want to enter?"<<std::endl;
std::cin>>count;
if (0 < count) {
    int numbers[count]; // Invalid (in C++)! The symbol count must be a compile-time constant!
    for (int i = 0; i < count; ++i) {
        std::cout<<"Enter number: "<<(i + 1)<<std::endl;
        std::cin>>numbers[i];
    }
}

Page 4: (5) cpp dynamic memory_arrays_and_c-strings

The Correct Way of creating dynamic Arrays

● In C/C++, dynamic memory allocation on the heap is needed in such a case.

– The C way to access the heap is to use the functions std::malloc() and std::free() from <cstdlib>.

– The process of requesting memory manually is called "allocation of memory".

– std::malloc() creates a block of specified size in the heap memory and returns a void* to this block to the caller.

● For array creation the []-declarator is not used with std::malloc()!

● (Dynamically created arrays can have the length 0!)

– The caller needs to check the returned void* for validity (Did std::malloc() succeed?).

● The returned void* must not be 0! A 0-pointer signals that the allocation failed.

– The caller needs to cast this void* to the correct pointer-type.

– Then the dynamically created array can be used like an "ordinary" array.

– When the work is done, the array must be manually freed somewhere in the code.

Page 5: (5) cpp dynamic memory_arrays_and_c-strings

The Correct Way of creating dynamic Arrays in Code

● Let's review and correct the example of dynamic array creation:

int count = 0;
std::cout<<"How many numbers do you want to enter?"<<std::endl;
std::cin>>count;
if (0 < count) {
    // Create a properly sized block in heap. The function std::malloc() returns a generic
    // pointer (void*) and we have to cast this generic pointer to the type we need.
    int* numbers = static_cast<int*>(std::malloc(sizeof(int) * count));
    if (numbers) { // Check, whether std::malloc() was successful.
        for (int i = 0; i < count; ++i) { // Loop over the dynamically created array:
            std::cout<<"Enter number: "<<(i + 1)<<std::endl;
            // Use the block like an ordinary array, e.g. with the []-operator:
            std::cin>>numbers[i];
        }
        std::free(numbers); // When done with the array, it must be freed!
    }
}

Page 6: (5) cpp dynamic memory_arrays_and_c-strings

Example: Why dynamic Memory is needed: Returning Arrays

● Automatic arrays can't be returned from functions:

int* GetValues() { // Defining a function that returns a pointer to a
    int values[] = {1, 2, 3}; // locally defined array (created on the stack).
    return values; // This pointer points to the 1st item of values.
}
//-------------------------------------------------------------------------------------------------
int* vals = GetValues(); // Semantically wrong! vals points to a
std::cout<<"2. val is: "<<vals[1]<<std::endl; // discarded memory location.
// The array "values" is gone, vals points to its scraps, probably rubbish!
// (Accessing it results in undefined behavior.)

● In C/C++, dynamic memory allocation on the heap is needed in such a case.

– Again, the C way is to use the functions std::malloc() and std::free().

– 1. In GetValues() the dynamic array will be created and returned.

– 2. The caller can then use the returned array.

– 3. Then the caller has to free the dynamically created array!

Page 7: (5) cpp dynamic memory_arrays_and_c-strings

The Correct Way of returning Arrays

● Let's review and correct the example of returning an array:

int* GetValues() { // Allocate an array of three ints:
    int* values = static_cast<int*>(std::malloc(sizeof(int) * 3));
    if (values) { // Check std::malloc()'s success and fill the array. Indeed the
        values[0] = 1; // allocation of memory and assigning of array values must be
        values[1] = 2; // separated, when std::malloc() is used (there is no Resource
        values[2] = 3; // Acquisition is Initialization (RAII)).
    } // If std::malloc() failed, let's just "forward" the 0, the caller needs to check for 0!
    return values; // Return the pointer to the heap block (i.e. to the array).
}
//-------------------------------------------------------------------------------------------------
int* vals = GetValues();
if (vals) { // We have to check for nullity again, GetValues() could have failed!
    std::cout<<"2nd value is: "<<vals[1]<<std::endl; // Use vals as array!
    // >2nd value is: 2
    std::free(vals); // The caller (!) needs to free vals!
}

Page 8: (5) cpp dynamic memory_arrays_and_c-strings

Heap Functions' Signatures in Detail

void* malloc(size_t size); // memory allocate in <cstdlib>
// size – The size of a portion of the heap in bytes. size_t is a typedef for an unsigned integer
//     type representing a count of bytes on a machine. Its exact type depends on the platform
//     (e.g. unsigned int or unsigned long).
// returns – A generic pointer to the allocated portion, or 0 in case of error (e.g. out of memory).
//     The pointer is generic because std::malloc() can't know the type of the allocated contents, it
//     just knows the size that the caller passed and it returns the location of that block in case
//     of success. The pointer must be cast to the type the caller expects. - We select the color of
//     the contact lenses to view the block.

void free(void* ptr); // in <cstdlib>
// ptr – The pointer to a block of content, allocated with std::malloc(), std::calloc() or
//     std::realloc(). The attempt to free a pointer to a static variable is undefined. Calling
//     std::free() with 0 is allowed, it just has no effect.
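The following is only a minimal sketch of how the two signatures play together. The helpers AllocateInts() and FillInts() are hypothetical names that don't appear in the slides, and nullptr (C++11) is used as the spelling of the 0-pointer. The sketch additionally guards the multiplication sizeof(int) * count against overflow of size_t, which the slides don't discuss.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical helper (not from the slides): allocates an uninitialized block for
// count ints. Returns a 0-pointer if the request can't be satisfied.
// The caller owns the returned block and has to std::free() it.
int* AllocateInts(std::size_t count) {
    // Guard against overflow of the multiplication sizeof(int) * count:
    if (count > std::numeric_limits<std::size_t>::max() / sizeof(int)) {
        return nullptr;
    }
    // std::malloc() returns a void*, the cast selects our "view" on the block:
    return static_cast<int*>(std::malloc(sizeof(int) * count));
}

// Hypothetical helper: assigns value to each of the count items.
void FillInts(int* items, std::size_t count, int value) {
    assert(items != nullptr || 0 == count);
    for (std::size_t i = 0; i < count; ++i) {
        items[i] = value; // The block is used like an ordinary array.
    }
}
```

A caller would check the result of AllocateInts() for the 0-pointer, use the block and finally pass it to std::free(). Passing a 0-pointer to std::free() is harmless.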

Page 9: (5) cpp dynamic memory_arrays_and_c-strings

Wrap up: automatic and dynamic Arrays in Code

● Creation of automatic arrays:

int autoArray[100];
// This is a simple automatic array with a compile time constant size of 100.

● Creation and freeing of dynamic arrays:

int* dynamicArray = static_cast<int*>(std::malloc(sizeof(int) * 100));
// - Indeed the syntax looks weird, not even similar to the autoArray example.
// - The type of the variable we assign to is int-pointer.
// - The function std::malloc() is used to create a raw memory-block in heap.
// - std::malloc() returns a generic pointer (void*) to this raw memory-block, it does so, because
//   it doesn't know, what the programmer wants to do.
// - So, as programmers we need to tell C/C++ that we want to use the allocated memory
//   block as int-array, therefore we need to cast the generic pointer (void*) to int-pointer. - We
//   "put contact lenses on".
std::free(dynamicArray); // Free the dynamically created array in the right place.

Page 10: (5) cpp dynamic memory_arrays_and_c-strings

But how does returning "normal" Variables work?

● When a local variable is returned from a function, it will be simply copied.

● Arrays can't be returned by value, so they can't be copied!

– Here the story is completely different, we generally have to use the heap!

int GetValue() { // Just returns an int:
    int value = 42; // value is an automatic variable on the stack.
    return value; // Returns value. value will be popped from GetValue()'s stack.
    // Then the content of value will be pushed to the stack of the caller function.
    // In effect value will be copied to its caller when GetValue() returns.
}
//-------------------------------------------------------------------------------------------------
void Foo() { // Calls GetValue():
    int val = GetValue(); // The returned int was pushed on Foo()'s stack by GetValue() and
    // will be copied into the variable val.
}

Page 11: (5) cpp dynamic memory_arrays_and_c-strings

Stack vs. Heap: It's not a Mystery, just two Concepts

● The stack is a conceptual place, where local (auto) variables reside.

– This is a little oversimplification, but each function has its own stack.

– The lifetime of a stack variable is determined by its scope (i.e. automatic: auto).

– The stack is controlled by hardware and owned by hardware.

● The heap is a conceptual place, where all dynamic contents reside.

– All functions of a program generally use the same heap.

– Dynamic content must be created by the programmer manually.

– The heap is controlled by software, the heap manager (std::malloc(), std::free() etc.).

– There is always an "entity" that is in charge of the allocated heap memory.

– This "entity" is responsible for explicitly freeing the allocated heap memory.

– In the end, the lifetime of a dynamic content is controlled by the entity's programmer.

● We should manually control as little memory as possible: using the stack is preferred!

Page 12: (5) cpp dynamic memory_arrays_and_c-strings

Stack vs. Heap: In Memory (RAM)

● There is the illusion that all the machine's memory is owned by a program.

– This is not true, each program uses its own portion of memory respectively.

– But in the following graphics we stick to this illusion.

● The memory is segmented, different segments have different functions.

● Esp. the stack and heap segment often have special "geometric" properties:

– The heap segment resides at lower addresses than the stack segment.

– The addresses of subsequent stack variables are decreasing.

● This is called "descending stack".

– The stack evolves/grows to lower, the heap to greater addresses.

● In fact, stack and heap grow to meet each other halfway!

– Compared to the stack, the heap is very big, because dynamic contents are typically bigger than automatic contents (e.g. local variables).

Page 13: (5) cpp dynamic memory_arrays_and_c-strings

Stack vs. Heap: Conventional Locations in Memory

[Figure: the memory from address 0 to 2^32 - 1, showing the heap segment and the stack segment.]

Page 14: (5) cpp dynamic memory_arrays_and_c-strings

The lost Pointer to the Heap Memory in Code

void F(int count) {
    // Allocating an array of count ints. F() is in charge of the dynamic content, to which
    // p points!
    int* p = static_cast<int*>(std::malloc(sizeof(int) * count));
    // The auto variable p will go out of scope and will be popped from stack. But the
    // referred dynamic content is still around!
}
//-------------------------------------------------------------------------------------------------
// Calling F():
F(3); // Oops, nobody did free the dynamic content, to which p was pointing! Now there is
// no pointer to the dynamic content available. This is a semantic error, a memory leak of
// sizeof(int) * 3. The compiler will not see any problem here!

Page 15: (5) cpp dynamic memory_arrays_and_c-strings

The lost Pointer to the Heap Memory in Memory

[Figure: the memory from address 0 to 2^32 - 1 with stack segment and heap segment. The stack variable p (4B) holds the address 0xc0005968 of the heap block with the ints 0, 1 and 2 (4B each).]

void F(int count) { // Allocating an array of count ints.
    int* p = static_cast<int*>(std::malloc(sizeof(int) * count));
    if (p) { // Check std::malloc()'s success.
        for (int i = 0; i < count; ++i) {
            p[i] = i;
        }
    }
}

// Calling F():
F(3);
// After F() did run: oops! The pointer to the allocated three
// ints is lost, the allocated memory is orphaned. We have a
// memory leak of 12B. :-(

Page 16: (5) cpp dynamic memory_arrays_and_c-strings

How to handle dynamic Content responsibly in Code

int* G(int count) {
    // Allocating an array of count ints. G() is in charge of the dynamic content, to which
    // p points!
    int* p = static_cast<int*>(std::malloc(sizeof(int) * count));
    // Returning p. Then G()'s caller is in charge of the dynamic content!
    return p; // The stack variable p will go out of scope and will be popped from the
    // stack. But the referred dynamic content is still around!
}
//-------------------------------------------------------------------------------------------------
// Calling G():
int* t = G(3);
if (t) { // Fine! The returned pointer will be checked and freed correctly.
    std::free(t);
}

Page 17: (5) cpp dynamic memory_arrays_and_c-strings

Handling dynamic Content responsibly in Memory

[Figure: the memory from address 0 to 2^32 - 1 with stack segment and heap segment. The stack variable p (4B) and later the caller's variable t hold the address 0xc0005968 of the heap block with the ints 0, 1 and 2 (4B each).]

int* G(int count) { // Allocating an array of count ints.
    int* p = static_cast<int*>(std::malloc(sizeof(int) * count));
    if (p) { // Check std::malloc()'s success.
        for (int i = 0; i < count; ++i) {
            p[i] = i;
        }
    }
    return p; // This time: return p!
}

int* t = G(3); // Call G() and receive the pointer.
if (t) { // Check and free the content (i.e. the
    std::free(t); // memory from the heap).
}
// The local variable t is still on the stack.

Page 18: (5) cpp dynamic memory_arrays_and_c-strings

Potential Problems with Heap Memory

● We need to check, whether the allocation was successful!

● We need to free dynamic content in the right place manually.

– We have to keep in mind that there is no garbage collection in C/C++.

– So, we should not forget to free dynamically created content!

– We should free dynamically created content as early as possible, but not too early!

– We should not free dynamically created content more than once!

– We should not free dynamically created content that we don't own.

● It's impossible to distinguish pointers to the stack from pointers to the heap.

– Don't free pointers to the stack (i.e. pointers not from the heap)! -> It will result in undefined behavior!

● Wherever function interfaces deal with dynamic content, it should be documented where this memory must be freed. Who's the owner? Who's in charge?
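Some of these rules can be made visible in code. The following is only a sketch under assumptions: the names CreateSquares() and FreeAndReset() are hypothetical and not taken from the slides, and nullptr (C++11) stands for the 0-pointer. The comments document the ownership, and resetting the pointer after freeing turns an accidental second free into a harmless std::free() of a 0-pointer.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical function: creates an array with the squares of 0 .. count - 1.
// Ownership: the caller owns the returned block and has to free it.
// Returns a 0-pointer, if the allocation failed.
int* CreateSquares(int count) {
    assert(count > 0);
    int* squares = static_cast<int*>(std::malloc(sizeof(int) * count));
    if (squares) { // Check std::malloc()'s success.
        for (int i = 0; i < count; ++i) {
            squares[i] = i * i;
        }
    }
    return squares;
}

// Hypothetical helper: frees the block and resets the caller's pointer.
// Ownership: the caller must own block, and block must stem from std::malloc(),
// std::calloc() or std::realloc() (or be a 0-pointer).
void FreeAndReset(int*& block) {
    std::free(block); // Freeing a 0-pointer is allowed and has no effect.
    block = nullptr;  // A repeated call via this pointer is harmless now.
}
```

Mind that FreeAndReset() only resets the one pointer that was passed. Other copies of the pointer still refer to the freed block and must not be used or freed.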

Page 19: (5) cpp dynamic memory_arrays_and_c-strings

More Information about Heap Memory

● There exist two further useful functions to deal with heap memory (<cstdlib>):

– std::realloc() resizes a given block of heap memory.

– std::calloc() allocates a block of size * count bytes and initializes all bytes with 0.

– The returned value must be checked for 0-pointer and freed with std::free().

● Free store in C++:

– In C++ the heap segment can also be used as free store.

– The operators new and delete act as interface to the free store.

– These operators represent C++'s way to dynamically allocate/deallocate user defined types.

– In general, C's heap memory and C++'s free store are incompatible: memory from std::malloc() must be freed with std::free(), memory from new must be freed with delete.

● Often 3rd party libraries invent own allocation/deallocation mechanisms:

– to deal with platform specialities,

– and/or to encapsulate usage of dynamic contents.
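A short sketch of std::calloc() and std::realloc() in action. The helper names CreateZeroedInts() and ResizeInts() are hypothetical and only serve the illustration. The important detail is that the result of std::realloc() is checked before it replaces the original pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: creates an array of count ints, all items are 0.
// The caller owns the returned block and has to std::free() it.
int* CreateZeroedInts(std::size_t count) {
    // std::calloc() awaits the count of items and the size of one item separately:
    return static_cast<int*>(std::calloc(count, sizeof(int)));
}

// Hypothetical helper: resizes the array to newCount items and returns true on success.
// On failure the original block stays valid and items still points to it.
// If the array grows, the added items are not initialized.
bool ResizeInts(int*& items, std::size_t newCount) {
    assert(newCount > 0);
    // Don't assign the result to items directly: if std::realloc() fails, it returns 0
    // and the only pointer to the original block would be lost (a memory leak).
    void* resized = std::realloc(items, sizeof(int) * newCount);
    if (!resized) {
        return false;
    }
    items = static_cast<int*>(resized); // The block may have been moved.
    return true;
}
```

After a successful ResizeInts() the old items keep their values, but the block may reside at another address, so older copies of the pointer must not be used anymore.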

Page 20: (5) cpp dynamic memory_arrays_and_c-strings

Putting Heap Memory to work with C-strings

● As c-strings are char arrays underneath, they share the limits of other arrays:

– Limitation 1: We cannot resize/extend or assign c-strings.

– Limitation 2: We cannot return an automatic c-string variable from a function.

● These limitations can be solved by usage of the heap memory:

– Pattern for 1: Create a dynamically sized char-array to hold a modified/resized copy of the original c-string. E.g.: Replace a substring of the original c-string.

– Pattern for 2: Copy a c-string into a dynamically sized char-array and return the pointer from a function.

● Sidebar: Peculiarities of c-strings, not directly shared with other arrays:

– C-strings are 0-terminated, so the very last char-array item contains a 0.

– We can get a c-string's length (std::strlen()), this is impossible with other arrays.

– As c-string literals are const char-arrays, we can't modify them.

– Indeed we can return a c-string literal from a function, because literals exist for the whole run time of the program!
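Both patterns can be sketched with a few lines of code. The functions CopyCString() and ReplaceFirst() are hypothetical examples and not part of the slides. sizeof(char) is always 1, so it is left out of the size calculations.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Pattern for 2 (hypothetical helper): returns a copy of text in a heap buffer.
// The caller owns the returned buffer and has to std::free() it. Returns 0 on failure.
char* CopyCString(const char* text) {
    assert(text != nullptr);
    const std::size_t size = std::strlen(text) + 1; // + 1 for the 0-termination.
    char* copy = static_cast<char*>(std::malloc(size));
    if (copy) {
        std::memcpy(copy, text, size); // Copies all chars and the 0-termination.
    }
    return copy;
}

// Pattern for 1 (hypothetical helper): returns a heap buffer with a copy of text, in
// which the first occurrence of from is replaced by to. If from doesn't occur, a plain
// copy is returned. The caller owns the returned buffer. Returns 0 on failure.
char* ReplaceFirst(const char* text, const char* from, const char* to) {
    assert(text != nullptr && from != nullptr && to != nullptr);
    const char* hit = std::strstr(text, from);
    if (!hit || '\0' == *from) {
        return CopyCString(text);
    }
    const std::size_t head = static_cast<std::size_t>(hit - text); // Chars before the hit.
    const std::size_t fromLength = std::strlen(from);
    const std::size_t toLength = std::strlen(to);
    const std::size_t tail = std::strlen(hit + fromLength);        // Chars after the hit.
    char* result = static_cast<char*>(std::malloc(head + toLength + tail + 1));
    if (result) {
        std::memcpy(result, text, head);
        std::memcpy(result + head, to, toLength);
        // Copy the rest including the 0-termination:
        std::memcpy(result + head + toLength, hit + fromLength, tail + 1);
    }
    return result;
}
```

With these sketches, ReplaceFirst("Weyland Yutani at LV-426", "LV-426", "Earth") yields a new buffer containing "Weyland Yutani at Earth", which the caller has to free with std::free().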

Page 21: (5) cpp dynamic memory_arrays_and_c-strings

The Roles of const char[] and char[]

● When we need to create an array that must be filled afterwards, we cannot use arrays with const items. Instead we need arrays as modifiable buffers.

– This is needed, because we need to assign to the items, in order to modify the content!

● So, c-string literals are of type const char[], their matching buffer type is char[].

– The buffers we allocate dynamically for char-based c-strings are always of type char[].

● Where c-string functions accept const char*'s as parameters, we can safely pass char[] buffers; they decay to char* and are implicitly converted to const char*.

● To sum up (char-based c-strings):

– C-string literals are of type const char[], they're not modifiable.

– C-string buffers are of type char[], they're modifiable.

Page 22: (5) cpp dynamic memory_arrays_and_c-strings

Working with C-strings: Functions for individual Chars

● A rich set of functions dealing with individual chars can be found in <cctype>.

● Character predicates (these functions expect an int/char and return an int/bool):

– std::islower(), std::isupper(), std::isalpha(), std::isdigit(), std::isspace() etc.

● Character casing (these functions expect an int/char and return an int/char):

– std::tolower() and std::toupper(), their result must be cast to char for presentation.

char ch = '1';
if (std::isdigit(ch)) {
    std::cout<<ch<<" is a digit!"<<std::endl;
    // >1 is a digit!
}

char ch = 'X';
// If ch is no upper case letter, the same char will be returned.
std::cout<<ch<<" as lower case: "<<static_cast<char>(std::tolower(ch))<<std::endl;
// >X as lower case: x
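Applied to a whole c-string, the predicates can be used in a loop. The function CountDigits() below is a hypothetical example. It also shows a detail that the snippets above leave out: the <cctype> functions expect a value in the range of unsigned char (or EOF), so a char should be converted before it is passed.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>

// Hypothetical helper: counts the digits in a c-string.
std::size_t CountDigits(const char* text) {
    assert(text != nullptr);
    std::size_t count = 0;
    for (; *text; ++text) { // Loop until the 0-termination is reached.
        // A plain char may be negative, so it is converted to unsigned char first:
        if (std::isdigit(static_cast<unsigned char>(*text))) {
            ++count;
        }
    }
    return count;
}
```

For the c-string "LV-426" the function returns 3.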

Page 23: (5) cpp dynamic memory_arrays_and_c-strings

Working with C-strings: Parsing C-strings

● There exists a set of functions to parse c-strings to fundamental types in <cstdlib>.

– Parsing means reading a c-string and reinterpreting its content e.g. as fundamental type.

– Most important: std::atoi() (c-string to int) and std::atof() (c-string to double).

– These functions return 0 in case of error, so an error can't be distinguished from a parsed 0.

// A c-string that can be interpreted as int:
const char anInt[] = "42";
// Parse the int from anInt's content:
int theInt = std::atoi(anInt);

// A c-string that can be interpreted as double:
const char aDouble[] = "1.786";
// Parse the double from aDouble's content:
double theDouble = std::atof(aDouble);
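If errors must be told apart from a parsed 0, the function std::strtol() from <cstdlib> is an alternative to std::atoi(). The wrapper ParseInt() below is a hypothetical sketch and not part of the slides. It treats an empty c-string, trailing non-digit chars and values outside the range of int as errors.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: parses text as decimal int. On success the value is stored in
// result and true is returned. On failure result is left untouched.
bool ParseInt(const char* text, int& result) {
    assert(text != nullptr);
    char* end = nullptr;
    errno = 0;
    // std::strtol() sets end to the first char that was not parsed:
    const long value = std::strtol(text, &end, 10);
    if (end == text || '\0' != *end) {
        return false; // No digits at all, or unparsed chars remain.
    }
    if (ERANGE == errno || value < INT_MIN || value > INT_MAX) {
        return false; // The value doesn't fit into an int.
    }
    result = static_cast<int>(value);
    return true;
}
```

With this sketch "0" is parsed successfully to 0, whereas "abc" and "42abc" are reported as errors. Like std::atoi(), std::strtol() skips leading whitespace.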

Page 24: (5) cpp dynamic memory_arrays_and_c-strings

Working with C-strings: Modify the Case of a C-string

● Putting it all together, here is a first example of c-string modification:

const char* oldText = "home"; // The original plain c-string.
// Create a buffer in the heap, large enough to hold the original c-string. The buffer needs a size of:
// sizeof(char) * (count of chars/letters + one byte for the 0-termination).
char* newText = static_cast<char*>(std::malloc(sizeof(char) * (std::strlen(oldText) + 1)));
if (newText) { // Check std::malloc()'s success.
    // Loop over the original c-string and store the upper case variant of each char and the
    // 0-termination into newText at the same index.
    for (std::size_t i = 0; i < std::strlen(oldText) + 1; ++i) {
        // std::toupper() expects an unsigned char value and returns an int:
        newText[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(oldText[i])));
    }
    std::cout<<"The modified text: "<<newText<<std::endl;
    // >The modified text: HOME
    std::free(newText); // Free the buffer.
}

Page 25: (5) cpp dynamic memory_arrays_and_c-strings

C-strings: Formatting and Concatenation of C-strings

● With the function std::sprintf() in <cstdio> we can create and format c-strings.

int sprintf(char* buffer, const char* format, ...); // in <cstdio>
// buffer – A char buffer that will contain the effectively created c-string content. The content
//     of the buffer is the actual result of this function; buffer is rather an output parameter
//     than an input parameter!
// format – A c-string that describes the layout of the c-string to create. This "format-string"
//     defines a template of the c-string to create, with placeholders for values to be inserted.
// ... – An arbitrary set of further arguments that are used to "satisfy" and replace the
//     placeholders in the "format-string". (The ...-notation is called "ellipsis" in C++.)
// returns – The count of resulting chars in buffer (w/o the 0-termination).

Page 26: (5) cpp dynamic memory_arrays_and_c-strings

C-strings: "Format-strings" and Examples

● There exist many type field characters acting as format placeholders.

– But %d and %s (maybe also %p for pointers) are the most important ones.

– There are different ways to control the format in a more detailed fashion:

● Additional flags control alignment and padding, also the width and precision can be controlled.

– If the format-string and the arguments don't match, the behavior is undefined.

char buffer[1000]; // buffer is an automatic fixed buffer of large extent.
// The placeholder %d expects an int to be inserted:
std::sprintf(buffer, "Answer: %d items", 42);
std::cout<<buffer<<std::endl; // Will print "Answer: 42 items".
// The placeholder %s expects another c-string to be inserted:
std::sprintf(buffer, "It's %s's problem!", "Rick");
std::cout<<buffer<<std::endl; // Will print "It's Rick's problem!".
// Now concatenation of e.g. three c-strings can be accomplished:
std::sprintf(buffer, "%s%s%s", "Weyland Yutani", " at ", "LV-426");
std::cout<<buffer<<std::endl; // Will print "Weyland Yutani at LV-426".
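The flags, width and precision mentioned above can be sketched as follows. The function FormatRow() is a hypothetical example. Instead of std::sprintf() it uses std::snprintf() from <cstdio>, which assumes C++11 and additionally takes the size of the buffer, so it never writes beyond the buffer's end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats a "table row" into buffer, which has room for size chars.
// Returns true, if the complete row (incl. 0-termination) did fit into buffer.
bool FormatRow(char* buffer, std::size_t size, const char* name, int id, double value) {
    assert(buffer != nullptr && name != nullptr && size > 0);
    // %-10s : a c-string, left aligned (flag -) in a field of width 10.
    // %05d  : an int, right aligned in a field of width 5, padded with zeros (flag 0).
    // %8.2f : a double, right aligned in a field of width 8, with a precision of
    //         2 digits after the decimal point.
    const int written = std::snprintf(buffer, size, "%-10s|%05d|%8.2f", name, id, value);
    // std::snprintf() returns the count of chars the complete result requires
    // (w/o the 0-termination), or a negative value in case of error:
    return 0 <= written && static_cast<std::size_t>(written) < size;
}
```

For the arguments "Rick", 42 and 3.14159 the buffer will contain "Rick      |00042|    3.14" (assuming the default "C" locale).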

Page 27: (5) cpp dynamic memory_arrays_and_c-strings

C-strings: Formatting C-strings the Dynamic Way

● In most scenarios we don't know how large the buffer must be to hold the result.

– We can define a very large (maybe auto) buffer, but this is neither safe nor efficient.

– Preferred: we can calculate the buffer length and create it dynamically!

● Let's learn how this works.

Page 28: (5) cpp dynamic memory_arrays_and_c-strings

C-strings: Formatting C-strings the Dynamic Way in Code

● This is a good way to safely concatenate c-strings with efficient memory usage:

const char s1[] = "Weyland Yutani";
const char s2[] = " at ";
const char s3[] = "LV-426";
// Calculate the length of the resulting c-string:
std::size_t countOfChars = std::strlen(s1) + std::strlen(s2) + std::strlen(s3);
// Allocate buffer with the exact size, sufficient for our situation:
char* buffer = static_cast<char*>(std::malloc(sizeof(char) * (countOfChars + 1)));
if (buffer) { // Check std::malloc()'s success then do the concatenation:
    std::sprintf(buffer, "%s%s%s", s1, s2, s3);
    std::cout<<buffer<<std::endl;
    // >Weyland Yutani at LV-426
    std::free(buffer); // Free buffer.
}
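Summing up lengths with std::strlen() works for plain concatenation, but not if the format-string contains other placeholders like %d. A possible alternative, assuming C++11, is to let std::snprintf() calculate the required length in a first pass. The function FormatItemCount() is a hypothetical sketch and not part of the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: creates the c-string "<name>: <count> items" in an exactly sized
// heap buffer. The caller owns the returned buffer and has to std::free() it.
// Returns a 0-pointer on failure.
char* FormatItemCount(const char* name, int count) {
    assert(name != nullptr);
    // 1st pass: with a 0-pointer and the size 0 std::snprintf() writes nothing, but it
    // returns the count of chars the result requires (w/o the 0-termination).
    const int length = std::snprintf(nullptr, 0, "%s: %d items", name, count);
    if (length < 0) {
        return nullptr; // Formatting failed.
    }
    const std::size_t size = static_cast<std::size_t>(length) + 1; // + 0-termination.
    char* buffer = static_cast<char*>(std::malloc(size));
    if (buffer) { // Check std::malloc()'s success.
        // 2nd pass: now format into the exactly sized buffer.
        std::snprintf(buffer, size, "%s: %d items", name, count);
    }
    return buffer;
}
```

A call like FormatItemCount("Answer", 42) yields a buffer with the content "Answer: 42 items". The format-string and the arguments must be identical in both passes, otherwise the calculated size doesn't match.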

Page 29: (5) cpp dynamic memory_arrays_and_c-strings


Thank you!