Unlike spoken language, which evolved over eons in Homo sapiens, the
concept of a written form for language has been invented independently
a number of times.
The earliest of these inventions quickly developed into Sumerian
cuneiform around 3000 BC.
Wikipedia defines a character as a grapheme which is...
A grapheme
designates the atomic unit in written language. Graphemes include
letters, Chinese ideograms, numerals, punctuation marks, and other
symbols.
Every language has a corresponding finite set of characters, whether it
uses a phonetic system (like English) or one based on ideograms (like
Chinese).
A language's common character set tends to be pretty stable.
When was the last time that English added a new character?
Groups of related languages (e.g., the Romance languages, or the
languages written in Arabic script) tend to converge on a common set
of characters for many practical reasons.
So, it's natural to represent a character as an integer, which
identifies the character in the character set.
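As a minimal sketch of this idea, assuming Python 3 (where the built-in functions ord and chr convert between a character and its integer code), the mapping can be tried out directly. The variable names below are purely illustrative.

```python
# Map a character to its integer code and back again.
code_a = ord("A")        # character -> integer
char_a = chr(code_a)     # integer -> character

# A string can then be viewed as a sequence of such integers.
codes = [ord(ch) for ch in "Hi!"]
rebuilt = "".join(chr(n) for n in codes)

print(code_a, char_a, codes, rebuilt)
```

The integer is only an index into an agreed character set; the same number could stand for a different character under a different set.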
The important computer character sets you should be familiar with are
ASCII and Unicode.
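ASCII, the American
Standard Code for Information Interchange, has 128 characters
designed to encode the unaccented Roman alphabet used in English,
together with digits, punctuation, and control codes. It does not
cover the accented letters that many other Western European languages
need.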
Unicode, which shares its character repertoire with the ISO
Universal Character Set, is an international standard designed to
handle all known languages and is now widely used on the web.
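The difference between the two sets can be illustrated with a short sketch, again assuming Python 3, whose strings are sequences of Unicode code points. The helpers in_ascii and try_ascii are hypothetical names introduced here for illustration only; the sample characters are written as escapes so the code does not depend on the file's encoding.

```python
def in_ascii(ch: str) -> bool:
    """Return True if the single character ch has a code below 128."""
    return ord(ch) < 128


def try_ascii(text: str):
    """Encode text as ASCII, or return None if some character does not fit."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return None


# "A", e with acute accent (U+00E9), and a Chinese character (U+4E2D).
samples = ["A", "\u00e9", "\u4e2d"]
report = {ch: (ord(ch), in_ascii(ch)) for ch in samples}

for ch, (code, fits) in report.items():
    print(f"U+{code:04X} fits in ASCII: {fits}")

print(try_ascii("cafe"))       # every character is below 128
print(try_ascii("caf\u00e9"))  # the accented letter is not in ASCII
```

Only the first sample has a code below 128; the other two exist in Unicode but fall outside ASCII's 128 slots.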