UMBC CS 201, Spring 02
UMBC CMSC 201 Spring '02 CSEE | 201 | 201 S'02 | lectures | news | help

CMSC 201
Programming Project Four

Name That Language

Out: Sunday 4/14/02
Due Date: Sunday 4/28/02, before midnight

The design document for this project, design4.txt, is due Sunday 4/21/02, before midnight.

The Objective

The purpose of this assignment is to give you practice with strings and chars, string and char library functions, sorting, arrays, dynamic memory allocation, file input, and command-line arguments.

The Background

Statistics about documents are used for many purposes by computer scientists. A tally of specific word occurrences within documents can be useful for determining the similarity of documents. This is helpful for document retrieval and is used by modern search engines for the Internet. Analysis of words used within a document can even determine authorship. The number of occurrences of letters can be useful, and has been well studied in the area of cryptology. The percentages of individual letters that occur within a document are language specific. This fact can even help determine the language of an encoded message.

There is a famous island monastery/fortress located on the English Channel in Normandy, France, known as Mont Saint Michel. As a popular tourist site, it attracts people from all over the world, who of course speak many different languages. To handle this diverse population of visitors, pamphlets detailing the history of Mont Saint Michel are published in English, French, Spanish, Italian and German. The English version of the pamphlet is provided as the sample run.

The Task

The good news is that Ms Bogar was fortunate enough to have the chance to visit Mont St. Michel and brought back several pamphlets, each in a different language. The bad news is that Ms Bogar can't tell which language is which. Your job is to help Ms Bogar identify which pamphlet is in which language.

You will be given four files which contain the text from the pamphlet available at Mont Saint Michel. Your task is to write a program that takes the name of the file to be read as a command-line argument, reads the text, analyzes it, and determines the language in which the text was written: Italian, German, Spanish or French. Each file is in a different language and is written entirely in that language. You will determine the language of a file from the frequency with which certain letters appear in the text. The following table lists the 5 most frequently used letters in the document in Italian, Spanish, French and German, in order.
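The counting step described above can be sketched as follows. This is a minimal illustration, not the required design: it assumes letters are tallied case-insensitively into a 26-element array, and the names tally_letters and tally_file are hypothetical. The project's actual requirements may call for something different, such as reading the whole file into a dynamically allocated string first.

```c
#include <ctype.h>
#include <stdio.h>

#define NUM_LETTERS 26

/* Add the letters found in text to counts[0..25] ('a' -> 0 ... 'z' -> 25).
   Upper- and lowercase are folded together; other characters are skipped.
   The caller is expected to zero the array before the first call. */
void tally_letters(const char *text, long counts[NUM_LETTERS])
{
    for (; *text != '\0'; text++) {
        unsigned char ch = (unsigned char)*text;
        if (isalpha(ch)) {
            int idx = tolower(ch) - 'a';
            if (idx >= 0 && idx < NUM_LETTERS)
                counts[idx]++;
        }
    }
}

/* Tally every letter in the named file, one character at a time.
   Returns 0 on success, or -1 if the file cannot be opened. */
int tally_file(const char *filename, long counts[NUM_LETTERS])
{
    FILE *fp = fopen(filename, "r");
    int ch;

    if (fp == NULL)
        return -1;

    while ((ch = fgetc(fp)) != EOF) {
        if (isalpha(ch)) {
            int idx = tolower(ch) - 'a';
            if (idx >= 0 && idx < NUM_LETTERS)
                counts[idx]++;
        }
    }
    fclose(fp);
    return 0;
}
```

A main function would typically check that argc is 2, pass argv[1] to tally_file, and report an error if it returns -1.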


Program Requirements


Program Hints

Sample Run

This sample run used the English pamphlet as the input data.

linux1[20] a.out english.txt
The string is :
Mont Saint Michel History- The Wonder of the West Mont Saint Michel is one of the medieval West's major legacies of its sacred history. Dedicated to Saint Michael in 708 following some miraculous visitations, in 966 it was entrusted by the Duke of Normandy to the Benedictine monks who made the island one of the most important places of pilgrimage in the Christian world, by building on the legend of the founding bishop, Aubert. The monks set about a superhuman construction program with work continuing without interruption from the year 1000 to the beginning of the 16th century. Thus the visitor will gain a comprehensive picture of medieval architecture as he explores its many buildings, squeezed onto the tip of the rock. Mont Saint Michel was also an impregnable fortress. Its heroic resistance to the English during the Hundred Years War earned it a symbolic place in the national psyche. The ramparts enclosing the village and the abbey fortifications bear witness to this powerful role. After the conversion of the abbey into a prison, which remained from the revolution until 1863, the monastery, designated an historic monument in 1874, underwent major restoration work. These works enable visitors to enjoy once again the splendor of a building that men in the Middle Ages saw as the image of Holy Jerusalem on earth. ----- Follow the Guide After entering the Guard Room, the fortified entrance to the abbey, the visitor climbs the Ceremonial staircase, which is the formal entrance to the abbey church. The path then passes between the church, on the right, and the abbey lodgings, on the left, linked by hanging passages. These rooms, constructed from the late 14th to the early 16th centuries, were the official residence of the abbots. The west terrace is formed from the primitive square in front of the abbey church and the first bays of the nave destroyed in the 18th century after a fire.
The neoclassical facade was rebuilt in 1780. From here there is a panoramic view over the bay, from le Grouin point to Champeaux point. To the west is Mont Dol and to the north, the small island of Tombelaine. The terrace also offers an excellent view of the neo-Gothic spire of the bell tower constructed in 1897 and the embossed copper and gold leaf statue of the archangel. The abbey church is built on the tip of the rock, on a platform consisting of four crypts which surround it and support the four arms of the cross. The elevation of the nave, typical of the Norman Romanesque style, is on three levels- arcades, triforium galleries and clerestory. The framework of the nave was clad in paneled barrel vaulting, as were most Romanesque churches in Normandy. The Romanesque chancel, which fell down in 1421, was rebuilt between 1446 and 1521 in the flamboyant Gothic style. The visit continues to the north of the church, with the Gothic monastery known as the "Merveille", the Wonder, because of the outstanding nature of the building. It was constructed after the fire of 1204, which devastated the abbey. The cloister looks out over the sea, to the north, and gives access to the refectory, the kitchen, church, dormitory, chartulary and various staircases leading to the lower levels. Around the garden, restored in 1965, the design of the cloister colonnades, where the height of the small columns is that of the human body, created an intimate setting in which the monks could meditate. The decorations on the quoins, sculpted in Caen stone, which can be carved into more elaborate designs than the granite of the buildings, was originally painted. Today only traces of plant material can be distinguished. In the vast refectory the monks took their meals in silence, while the reader read to them from the pulpit in the south wall. Narrow windows are set into the side walls of this room, invisible from the entrance but which allow a stream of light to pass through. 
Access to the lower floor is via a staircase where the monks work room is to be found, later known as the "Salle des Chevaliers", the Knight's Hall, together with the Guest Hall where distinguished guests were received. On the ground floor poor pilgrims were fed and lodged in the almshouse, a vast hall divided into two naves by a row of columns. The nearby cellarium, an immense cool and in shadow-filled storeroom, is divided into sections by two rows of square pillars to ensure that provisions were stored in a logical fashion. A large model dated 1701, a copy of an original made in 1690 and conserved in the Relief-Map museum in Paris, is on display in the cellarium. It shows Mont Saint Michel as it was before the revolution. There is also a life size maquette of Saint Michel de Fremiet who crowns the church spire. The way out is through the gardens on the north side whose peaceful paths meander, facing the immensity of the bay, beneath the steep walls of the "Merveille". Clerestory- a series of windows in the upper part of a church building, but clear of the roof, admitting light to the central area of a built space. Chartulary- Where the abbey records were kept. Quoin- The roughly triangular shaped area between the tops of two arches. The three levels of the monastery reflect, from top to bottom, the structure of medieval society- clergy, nobility, and third estate, and the hierarchy of nourishment- spiritual, intellectual and material. ----- Architecture- Medieval architecture The abbey of Mont Saint Michel offers a complete overview of medieval architecture. Pre-Romanesque architecture is represented by the church of Notre-Dame-sous-Terre, 10th century, where traditional Romanesque features can still be seen - very thick walls constructed of small rubblestone and Norman arches clad in flat brick. The 11th century offers Romanesque volumes at their fullest in the crypts of the transept and the south side of the nave of the church. 
The masonry facings are meticulously laid out in a regular pattern with fine jointing. The 12th century sought a lighter style of construction and used the pointed arch in the lower north side of the nave. In the Ambulatory, the architects conceived vaults rising over a skeleton of diagonal arches. This innovation was to lead to the birth of the Gothic style. The new process permitted the massive and thick Romanesque vaultings to be replaced by a delicate vaulted structure supported by arches. Since the weight was thus distributed over the pillars, larger and larger openings could be made in the walls. The first floor of the "Merveille", the Wonder, dating from the 13th century, demonstrates the mastery of this system of construction. The Flamboyant style 15th century chancel expresses the culmination of Gothic architecture. Since the vaulting rests on fine pillars, supported on the outside by majestic flying buttresses, the sanctuary could be transformed into a space bathed in light. Norman arch- A semi-circular arch, a revival of the Roman style (hence Romanesque). Ambulatory- In French "Promenoir" - where the monks and laymen could walk. Flamboyant- Refers to the late Gothic period (in France from the late 14th century) which favored decorative curves and reverse curves resembling flames. ----- Saint Michael Saint Michel- Saint Michel of the summits The mount, dedicated to Saint Michael in 708, was, with Mount Gargan in Southern Italy, one of the principal places of worship consecrated to the archangel in the West. Devotion to Saint Michael had a very special significance in medieval religious life. The archangel Michael had three tasks - he weighed souls in order to separate them into the elect and the damned, he lead them to heaven protecting them against lurking demons and lastly, he guarded the gates of Paradise. 
Thus peaks close to heaven, such as Saint-Michel-de-l'Aiguilhe in Puy and Saint-Michel-de-Cuxa in the Pyrenees, were often consecrated to him, and high chapels above the entrances to a number of important churches were dedicated to him, like Tournus, Vezelay, and Saint-Benoit-sur-Loire. In the 15th century, worship of the archangel acquired a new importance with the creation of the Order of Saint Michael. The 19th century rediscovered the Middle Ages, as the Fremiet statue, erected on the top of the spire in 1897, bears witness.

It consists of 8399 characters in all.
There are 1526 space(s), 214 punctuation mark(s), 85 digit(s), and 1375 word(s)

The letters in descending order by their occurrences are shown below:
e occurred 871 times
t occurred 663 times
a occurred 480 times
o occurred 473 times
i occurred 457 times
r occurred 435 times
n occurred 432 times
h occurred 418 times
s occurred 405 times
l occurred 285 times
c occurred 261 times
d occurred 213 times
u occurred 190 times
m occurred 180 times
f occurred 161 times
g occurred 112 times
w occurred 108 times
p occurred 101 times
y occurred 97 times
b occurred 96 times
v occurred 76 times
k occurred 31 times
q occurred 15 times
j occurred 6 times
x occurred 5 times
z occurred 3 times

The file, english.txt, is written in English.
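The statistics at the end of the sample run could be produced along the following lines. This is only a sketch under stated assumptions: the type and function names (LetterCount, TextStats, classify_text, sort_letters, print_letters) are hypothetical, a "word" is taken to be any run of non-whitespace characters (the assignment's own definition is not given here, so the counts may differ from the sample), and the library function qsort stands in for whatever sorting method the project requires.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_LETTERS 26

/* Hypothetical record pairing a letter with its tally. */
typedef struct {
    char letter;
    long count;
} LetterCount;

/* Hypothetical summary of the non-letter statistics in the report. */
typedef struct {
    long spaces;
    long punctuation;
    long digits;
    long words;
} TextStats;

/* Classify each character with the <ctype.h> predicates.
   Assumption: a word is a maximal run of non-whitespace characters. */
TextStats classify_text(const char *text)
{
    TextStats stats = {0, 0, 0, 0};
    int in_word = 0;

    for (; *text != '\0'; text++) {
        unsigned char ch = (unsigned char)*text;
        if (ch == ' ')
            stats.spaces++;
        if (ispunct(ch))
            stats.punctuation++;
        if (isdigit(ch))
            stats.digits++;
        if (isspace(ch)) {
            in_word = 0;
        } else if (!in_word) {
            in_word = 1;
            stats.words++;
        }
    }
    return stats;
}

/* qsort comparator: larger counts first, ties broken alphabetically. */
static int by_count_desc(const void *a, const void *b)
{
    const LetterCount *x = (const LetterCount *)a;
    const LetterCount *y = (const LetterCount *)b;

    if (x->count != y->count)
        return (x->count < y->count) ? 1 : -1;
    return (x->letter > y->letter) - (x->letter < y->letter);
}

/* Pair each tally in counts[0..25] with its letter, then sort descending. */
void sort_letters(const long counts[NUM_LETTERS], LetterCount out[NUM_LETTERS])
{
    int i;

    for (i = 0; i < NUM_LETTERS; i++) {
        out[i].letter = (char)('a' + i);
        out[i].count = counts[i];
    }
    qsort(out, NUM_LETTERS, sizeof out[0], by_count_desc);
}

/* Print the sorted tallies in the style of the sample run. */
void print_letters(FILE *fp, const LetterCount sorted[NUM_LETTERS])
{
    int i;

    for (i = 0; i < NUM_LETTERS; i++)
        fprintf(fp, "%c occurred %ld times\n", sorted[i].letter, sorted[i].count);
}
```

Keeping the letter next to its count in one struct means the sort cannot separate the two, which is the usual pitfall when sorting a bare array of counts.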

Submitting the Program

To submit your program, type the following command at the Unix prompt:

submit cs201 Proj4 followed by the .c and .h files necessary for compilation

To verify that your project was submitted, you can execute the following command at the Unix prompt. It will show all files that you submitted in a format similar to the Unix 'ls' command.

submitls cs201 Proj4


Sunday, 14-Apr-2002 16:07:32 EDT