String-valued variables are usually declared to be pointers of type char *.
Such variables do not include memory space for the text of a string. Memory space must be allocated by means
of an array declaration, or a string constant, or dynamic memory allocation.
It is up to you to store the address of the chosen memory space into the pointer variable.
Examples:
Constant string | Array declaration | Dynamic memory allocation |
char * p = "hello"; | char buf[6]; char * p = &buf[0]; strcpy(buf, "hello"); | char * p = NULL; p = (char *)malloc(strlen("hello")+1); strcpy(p, "hello"); |
In each of these cases, p ends up pointing to six bytes of memory that hold the characters 'h', 'e', 'l', 'l', 'o' and the terminating null character '\0'.
Note that the first line of code in the rightmost column above stores the null pointer (NULL) in the pointer variable. The null pointer does not point anywhere. Attempting to access a string through it is an error: the behavior is undefined, and the program typically crashes.
Important note: All the C standard library functions that operate on strings rely on the terminating null character ('\0'). Do not confuse it with NULL, which is the null pointer.
Operations on strings (or arrays of characters) are an important part of many programs. The GNU C library provides an extensive set of string utility functions, including functions for copying, concatenating, comparing, and searching strings. To use these functions, include the header file string.h in your program.
Function: size_t strlen (const char *s);
The strlen function returns the length of the null-terminated string s (not counting the terminating null character). In other words, it returns the offset of the terminating null character within the array. For example,
strlen ("hello, world") => 12
When applied to a character array, the strlen function returns the length of the string stored there, not the number of bytes allocated for the string. You can get the size of the character array that holds a string using the sizeof operator:
    char string[32] = "hello, world";
    sizeof (string) => 32
    strlen (string) => 12
Function: char * strcpy (char *to, const char *from);
This copies characters from the string from (up to and including the terminating null character) into the string to. Like memcpy, this function has undefined results if the strings overlap. The return value is the value of to. Example:
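Code | Memory snapshot |
char buf[10]; char * p = "hello"; | buf: 10 bytes, contents not yet set; p: points to the constant "hello" |
strcpy(buf, p); | buf: 'h' 'e' 'l' 'l' 'o' '\0', remaining 4 bytes unchanged |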
Function: char * strncpy (char *to, const char *from, size_t size);
This function is similar to strcpy but always writes exactly size characters into to. If the length of from is size or more, then strncpy copies just the first size characters and does not add a terminating null character. If the length of from is less than size, then strncpy copies all of from, followed by enough null characters to add up to size characters in all. This behavior is rarely useful, but it is specified by the ANSI C standard. Example:
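Code | Memory snapshot |
char buf[10]; char* p = "hello"; | buf: 10 bytes, contents not yet set; p: points to the constant "hello" |
strncpy(buf, p, 2); | buf: 'h' 'e', remaining 8 bytes unchanged; no null character is written |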
Function: char * strcat (char *to, const char *from);
The strcat function is similar to strcpy, except that the characters from from are concatenated or appended to the end of to, instead of overwriting it. That is, the first character from from overwrites the null character marking the end of to. Example:
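Code | Memory snapshot |
char buf[10] = "Say"; | buf: 'S' 'a' 'y' '\0', remaining 6 bytes are zero |
strcat(buf, " hello"); | buf: 'S' 'a' 'y' ' ' 'h' 'e' 'l' 'l' 'o' '\0' (all 10 bytes used) |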
Function: char * strncat (char *to, const char *from, size_t size);
This function is like strcat except that not more than size characters from from are appended to the end of to. A single null character is also always appended to to, so the total allocated size of to must be at least size + 1 bytes longer than its initial length.
Here is an example showing the use of strncpy and strncat. Notice how, in the call to strncat, the size parameter is computed to avoid overflowing the character array buffer.
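    #include <string.h>
    #include <stdio.h>

    #define SIZE 10

    static char buffer[SIZE];

    int
    main (void)
    {
      /* "hello" is shorter than SIZE, so the rest of buffer is filled with nulls. */
      strncpy (buffer, "hello", SIZE);
      printf ("%s\n", buffer);
      /* Append at most as many characters as still fit before the final null. */
      strncat (buffer, ", world", SIZE - strlen (buffer) - 1);
      printf ("%s\n", buffer);
      return 0;
    }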
The output produced by this program looks like:
    hello
    hello, wo
Function: int strcmp (const char *s1, const char *s2);
The strcmp function compares the string s1 against s2, returning a value that has the same sign as the difference between the first differing pair of characters (interpreted as unsigned char objects, then promoted to int).
If the two strings are equal, strcmp returns 0.
A consequence of the ordering used by strcmp is that if s1 is an initial substring of s2, then s1 is considered to be "less than" s2.
Function: int strncmp (const char *s1, const char *s2, size_t size);
This function is similar to strcmp, except that no more than size characters are compared. In other words, if the two strings are the same in their first size characters, the return value is zero.
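Here are some examples showing the use of strcmp and strncmp. These examples assume the use of the ASCII character set. Only the sign of a nonzero result is guaranteed; the exact values shown are typical but may differ between implementations.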
    strcmp ("hello", "hello")
        => 0    /* These two strings are the same. */
    strcmp ("hello", "Hello")
        => 32   /* Comparisons are case-sensitive. */
    strcmp ("hello", "world")
        => -15  /* The character 'h' comes before 'w'. */
    strcmp ("hello", "hello, world")
        => -44  /* Comparing a null character against a comma. */
    strncmp ("hello", "hello, world", 5)
        => 0    /* The initial 5 characters are the same. */
    strncmp ("hello, world", "hello, wide world!!!", 5)
        => 0    /* The initial 5 characters are the same. */
Function: char * strtok (char *newstring, const char *delimiters);
A string can be split into tokens by making a series of calls to the function strtok.
The string to be split up is passed as the newstring argument on the first call only. The strtok function uses this to set up some internal state information. Subsequent calls to get additional tokens from the same string are indicated by passing a null pointer as the newstring argument. Calling strtok with another non-null newstring argument reinitializes the state information. It is guaranteed that no other library function ever calls strtok behind your back (which would mess up this internal state information).
The delimiters argument is a string that specifies a set of delimiters that may surround the token being extracted. All the initial characters that are members of this set are discarded. The first character that is not a member of this set of delimiters marks the beginning of the next token. The end of the token is found by looking for the next character that is a member of the delimiter set. This character in the original string newstring is overwritten by a null character, and the pointer to the beginning of the token in newstring is returned.
On the next call to strtok, the searching begins at the next character beyond the one that marked the end of the previous token. Note that the set of delimiters delimiters does not have to be the same on every call in a series of calls to strtok.
If the end of the string newstring is reached, or if the remainder of the string consists only of delimiter characters, strtok returns a null pointer.
Warning: Since strtok alters the string it is parsing, you should always copy the string to a temporary buffer before parsing it with strtok. If you allow strtok to modify a string that came from another part of your program, you are asking for trouble; that string may be part of a data structure that could be used for other purposes during the parsing, when alteration by strtok makes the data structure temporarily inaccurate.
The string that you are operating on might even be a constant. Then when strtok tries to modify it, your program will get a fatal signal for writing in read-only memory.
Here is a simple example showing the use of strtok.
    #include <string.h>
    #include <stddef.h>

    ...

    char buf[] = "words separated by spaces -- and, punctuation!";
    const char delimiters[] = " .,;:!-";
    char *token;

    ...

    token = strtok (buf, delimiters);    /* token => "words" */
    token = strtok (NULL, delimiters);   /* token => "separated" */
    token = strtok (NULL, delimiters);   /* token => "by" */
    token = strtok (NULL, delimiters);   /* token => "spaces" */
    token = strtok (NULL, delimiters);   /* token => "and" */
    token = strtok (NULL, delimiters);   /* token => "punctuation" */
    token = strtok (NULL, delimiters);   /* token => NULL */