Compile your own programs from Source

This week we are going to investigate how compiling applications under Linux has evolved. We will first take a brief look at what it was like to compile programs the old-fashioned way, and then we will see how newer applications have made the process much more foolproof. In addition to learning how to compile, we will look at a few of the elements in a C source file.

The idea of this talk is not to make you C programmers, but to show you how the Linux tool box was shaped by the C programming language and how it uses a systematic approach to overcome difficulties.

Outline

• Background: What is the philosophy behind compiling from source?
• How to Compile Single Files Manually: This would be appropriate for a Hello World type program.
• Include/Header Files: A little C file internals so the ideas make sense.
• Command Line Compiling: Let's start by understanding how to use the command line to compile a C program.
• The Makefile: The beginning of automation in the build process.
• Multi-file Compiles with Make: Now we add some more complexity to the build.
• Why Compiling Was a Problem in the Past (a Little History): Why compiling from source code files was so complex.
• The Linux File System Standard: A method to the madness of directory structures.
• The Configure Script: The beginning of configuration automation in the source world.
• Compiling WebMagick from Source: Let's look at the steps needed to compile a modern program under Linux.
• Conclusion: So what have we learned?
• A Perl Solution to Compiling Packages: Now that we understand how to build packages from source code, let's see how one application has taken this even further.
• How About an Example: I will show you how to add a Perl module to your distribution.

Background

Since the Unix system was designed as a programmer's system, it was assumed that its users were familiar with how to create and compile programs. Now that Linux is moving more into the mainstream, many of the new users do not come from a Computer Science background. This presents an interesting challenge to the authors of new applications. Basically, they either need to make their programs available to a wider non-technical audience, or limit their users to the few who know how to work directly with the code.

Not so long ago, it was assumed that everyone who installed a Linux system knew how to recompile the kernel. When you installed a new system, you were expected to select the hardware on your system from the provided driver lists, and recompile the kernel to include only the drivers necessary to run your computer. This was done to improve the performance of the system by tuning the kernel to support only the needed drivers, not everything the distribution included. It was necessary because all the drivers lived in the kernel all the time, so reducing the number of drivers from a few hundred to a dozen was a big improvement in performance. It made the kernel considerably smaller, thus faster to load, and left more RAM free for running programs.

How to Compile Single Files Manually

Let's start out by showing how to compile a simple C file and then add more complexity. The idea is not to make you a programmer, but to give you a better understanding of what the newer tools are doing for you.

Let's start with a look at a fortune cookie program as an example. Here is the directory listing of the source directory for this application.

    -rw-r--r--    1 john     wheel        5415 Apr 24  1995 fortune.6
    -rw-r--r--    1 john     wheel       30443 Apr 24  1995 fortune.c
    -rw-r--r--    1 john     wheel         157 Apr 24  1995 Makefile
    -rw-r--r--    1 john     wheel        1980 Apr 24  1995 pathnames.h

The first file, fortune.6, is the man page, so we will ignore it for now. Next is fortune.c, which is the source code for the application. The file Makefile holds build directives that we will come back to later. The file pathnames.h is a header file pointing to the location of the fortune cookie data files.

In the C programming language the author can use symbols to represent values. Let's take a look at the include file above named “pathnames.h”. It begins with a long comment, but we are interested in the line which says:

    #define     FORTDIR         "/usr/share/fortune"

This tells the program in fortune.c where to look for the data files used by the fortune program. In C programming it is common to have your program read these types of values from another file instead of placing them inside your program. This has several advantages. First, it means that each user does not need to modify the main program file; all needed changes can be made in the header files. A second advantage is that the same value can be used in many files, but it is set only once. If we had many C files to compile, they could all read the same header file to get their options.

OK, why should we care about this? You will see later how this ability to read information from other files helps automate the build process.

Command line compiling

For this example we could just use the command gcc fortune.c -o fortune, which would compile the fortune program and leave the result in a file named fortune. Fairly simple so far. We could then copy the file fortune to /usr/local/bin, load the data files into /usr/share/fortune, and we are done.

The Makefile

Now let's look at the Makefile included with fortune. Make is designed to issue the correct commands to compile the source files. It has the additional benefit that it will skip source files which are older than their object files. This becomes useful when you have many source files and you only modify a few of them: Make will recompile only the ones that changed. Let's take a look at the contents of this Makefile.

    CFLAGS=$(O) -I../strfile

    fortune: fortune.o

    install:
        install fortune /usr/bin/
        install -m 644 fortune.6 /usr/man/man6/

    clean:
        rm -f fortune.o fortune

The first line sets an option for the compile, specifying that the directory ../strfile should be searched for include files. The next line says that to produce the file fortune you need to create the object file fortune.o. The install target uses the install program to move fortune to /usr/bin and the man page to /usr/man/man6, and the clean target removes the built files. Let's try issuing the command make -n in this directory to see what it produces. The option -n says to display the commands, but not run them.

    cc -I../strfile -c -o fortune.o fortune.c
    cc fortune.o -o fortune

The first command calls the C compiler, cc, tells it to search the directory ../strfile for include files, and compiles fortune.c into the object file fortune.o. The second command links the object file fortune.o to create the output file fortune.

Multi-file Compiles with Make

In the example above we saw how we could use Make to compile and install a single file. But remember that it searched the directory ../strfile. If you look there you will find another program, strfile. This program is a preprocessor for the fortune cookies: it prepares the input files for use by the fortune program. If you checked the directory ../strfile, you would find a structure similar to that of fortune above. We need both of these programs to get the fortune cookie package working. The answer is a Makefile in the directory above both of these directories. So let's have a look at the contents of that Makefile.

    SUBDIRS=strfile unstr fortune datfiles
    flags="O=-O2 -fomit-frame-pointer -pipe"
    LDFLAGS=-s

    all:
        for i in ${SUBDIRS}; do make -C $$i ${flags}; done

    install:
        for i in ${SUBDIRS}; do make install -C $$i ${flags}; done

    clean:
        for i in ${SUBDIRS}; do make clean -C $$i ${flags}; done

This Makefile defines a set of directories in SUBDIRS, and then calls make in each of those directories to do the work. So you now have a recursive make which compiles all the programs and processes all the files.

Why Compiling Was a Problem in the Past (a Little History)

There was a time when it was difficult to compile a program written for one version of Unix on another version. If you wrote your program on AT&T System V and tried to compile it under Solaris, you would run into a number of problems. The problems came from the different libraries, as well as the different compiler options, on the other versions of Unix. This was dealt with either by creating different versions for the different systems, or by creating a series of #ifdef statements in your code to change the definitions based on the system type. This became a maintenance nightmare for developers. When Linux first got started there was concern that the same situation would develop between different distributions of Linux. This was headed off by two developments. The first was a common filesystem definition, sometimes known as the Linux File System Standard.

The Linux File System Standard

Quote from The Linux File System Standard by Garrett D’Amore

Many of us in the Linux community have come to take for granted the existence of great books on Linux like those being produced by the Linux Documentation Project. We are used to having various packages taken from different Linux FTP sites and distribution CD-ROMs integrate together smoothly. We have come to accept that we all know where critical files like mount can be found on any machine running Linux. We also take for granted CD-ROM-based distributions that can be run directly from the CD, with only a small amount of physical hard disk or a RAM disk for some variable files like /etc/passwd, etc.

This has not always been the case. Many of us remember Linux from the days of the SLS, TAMU, and MCC Interim distributions of Linux. In those days, each distributor had his own favorite scheme for locating files in the directory hierarchy. (Actually, some can go back further, back to the days when the boot and root disks were the only means of getting Linux installed on your hard drive, but I have not been a member of the Linux community quite that long.) Unfortunately, this caused many problems when dealing with different distributions.

The advantages that came with this were common locations for the executables, libraries, and user files. This aided developers because they did not need to look in as many places to find the libraries and include files. It also made installations easier, since there were fewer places to put the files. This is not to say that everyone puts their files in the same places, but in general they stick to the standard.

So how does this impact programmers? It means that it is enough to look in a few known places for the include files and libraries. This simplifies things on one level.

Another problem was knowing which libraries and programs were available on a given system. This problem was addressed by a program called configure.

The Configure Script

The configure program was designed to work out many of the build options for the programmer before the compile starts. So what is this magic file? It is often a shell script, although not always. Its job is to figure out your system and define values for a header file. It can do this by detecting files on its own using tests; it will discover the locations of the system include files and library files. The other part of configure consists of asking the user questions. Generally these questions have default answers, so you can just press Enter to accept the defaults.

Let's have a look at running the configure program for the application WebMagick. Here is part of what is displayed when you execute the configure script that comes with the package.

    checking whether build environment is sane... yes
    checking for a BSD compatible install... /usr/bin/install -c
    checking for perl... /usr/bin/perl
    checking for RGB database... /usr/X11R6/lib/X11/rgb.txt
    checking for xlsfonts... /usr/X11R6/bin/xlsfonts
    checking for default font... /usr/X11R6/bin/xlsfonts:  unable to open display ''
    usage:  /usr/X11R6/bin/xlsfonts [-options] [-fn pattern]
    where options include:
    -l[l[l]]                 give long info about each font
    -m                       give character min and max bounds
    -C                       force columns
    -1                       force single column
    -u                       keep output unsorted
    -o                       use OpenFont/QueryFont instead of ListFonts
    -w width                 maximum width for multiple columns
    -n columns               number of columns if multi column
    -display displayname     X server to contact

    fixed
    checking for server root path...

    In order to map paths into URLs, WebMagick must know the path to your
    server root. For example, if your server's URL is
    "http://yourdomain/" and this URL resolves to "/usr/local/share/htdocs"
    on the system, then the appropriate answer is "/usr/local/share/htdocs".

    What is the path to your servers root?
    /var/www/html
    /var/www/html
    checking for WebMagick icon path...

The lines which begin with checking show the script looking for programs available on your computer. The lines beginning with usage: are an error message, since the program xlsfonts was called outside a graphical display. The text beginning In order to map … is a prompt asking the user to supply the path to the web server's root directory.

Once the configure script finishes its checks, it ends with:

    updating cache ./config.cache
    creating ./config.status
    creating Makefile
    creating doc/Makefile
    creating doc/fig/Makefile
    creating icons/Makefile
    creating utils/Makefile
    creating webmagick
    creating webmagickrc

It has now created all the Makefiles, as well as the webmagick program itself, since it is a Perl script. To complete the build, all we need to do is issue the commands make and make install.

Compiling WebMagick from Source

1. Change to a convenient location to perform the compile. I normally use /usr/src.
2. Unarchive the package using tar: tar xzvf /tmp/WebMagick-2.02.tar.gz
3. Change to the top-level directory: cd WebMagick-2.02
4. Run the configure script to set up the compile: ./configure
5. Compile the program with make: make
6. Install the program with make: make install. This step is best done as root.

At this point you are done. If you want to reclaim the space, you can delete the compile directory with the command rm -fr /usr/src/WebMagick-2.02. That's it.

Conclusion

With many programs these days, building from source is easier than it seems. But be warned, not everything is roses. There are still times when you cannot get the source to compile smoothly, or when it is still missing something important.

When you cannot compile the source files, you need to find a package already compiled for your distribution. Let's look at some sources of precompiled applications. I will assume you are using a Red Hat distribution, since this part of the talk would change if you were using the Debian, Gentoo, or Slackware distributions.

• Help from a FAQ: Many applications have additional information on their web site about how to compile the sources. Look for an FAQ or a mailing list archive.
• Try Google: If you use the Google search engine, you will notice a tab labeled Groups. This is a search engine for mailing lists and newsgroups. If I were looking for help compiling WebMagick on Red Hat 9.0, I might try a Groups search something like: WebMagick compile problem Redhat 9.0. Sometimes by doing some looking you can find an answer. If not, you might be able to post a question for someone else to answer.
• Is there a Package?: Whenever I do a new installation of Red Hat, I create an index file of the packages. If I have the first install CD mounted under /mnt/cdrom, I use the command rpm -qilp /mnt/cdrom/RedHat/RPMS/*.rpm | tee -a cdrom.package.list.txt. This goes through each RPM file and copies the information and file names of each package to a file, which lets me search the list for a specific application before searching elsewhere.
• www.rpmfind.net: This web site lists RPM packages for many applications. You can search there for an application and find an RPM file to install. The only gotcha is that sometimes you cannot find a version for your distribution. But you could download the src.rpm file and try to compile it locally with the command: rpmbuild --rebuild

One of the real strengths of Linux is its ability to let you grow the system to suit your fancy. The ability to add applications to an existing distribution greatly broadens our horizons.

A Perl solution to compiling packages

We have been discussing compiling mostly C source files. What about Perl? It has many modules available to enhance your system. First of all, where to start: the Comprehensive Perl Archive Network (CPAN). This web site contains hundreds if not thousands of modules which can be added to your local Perl installation. You could download a Perl package, untar it, and compile it much as we saw above with C source files.

Yes, I know Perl is an interpreted language, so why do we need to compile new packages?

The answer is that modules need some customization to be placed on your system. Additionally, a new module may need another module installed before it will work correctly. So the programmers involved with CPAN created an interactive method of adding modules to your system.

The key to this magic is a module called CPAN. If you have an internet connection you can use Perl to install new modules on itself.

I decided I wanted a module called Date::Convert. So, as root, I used the CPAN module to do the work. Here is what it looked like:

998 [root:v0/] /root
# perl -MCPAN -e shell

cpan shell -- CPAN exploration and modules installation (v1.61)
ReadLine support available (try 'install Bundle::CPAN')

cpan> make Date::Convert
Database was generated on Wed, 28 Jan 2004 07:55:47 GMT
Fetching with LWP:
ftp://cpan.nas.nasa.gov/pub/perl/CPAN/authors/01mailrc.txt.gz
Fetching with LWP:
ftp://cpan.nas.nasa.gov/pub/perl/CPAN/modules/02packages.details.txt.gz
Database was generated on Mon, 09 Feb 2004 08:51:13 GMT

There's a new CPAN.pm version (v1.76) available!
[Current version is v1.61]
You might want to try
install Bundle::CPAN
without quitting the current session. It should be a seamless upgrade
while we are running...

Fetching with LWP:
ftp://cpan.nas.nasa.gov/pub/perl/CPAN/modules/03modlist.data.gz
Running make for module Date::Convert
Running make for M/MO/MORTY/DateConvert-0.16.tar.gz
Fetching with LWP:
ftp://cpan.nas.nasa.gov/pub/perl/CPAN/authors/id/M/MO/MORTY/DateConvert-0.16.tar.gz
Fetching with LWP:
ftp://cpan.nas.nasa.gov/pub/perl/CPAN/authors/id/M/MO/MORTY/CHECKSUMS
Checksum for /root/.cpan/sources/authors/id/M/MO/MORTY/DateConvert-0.16.tar.gz ok
Scanning cache /root/.cpan/build for sizes
DateConvert-0.16/
DateConvert-0.16/Convert.pm
DateConvert-0.16/t/
DateConvert-0.16/t/identity.t
DateConvert-0.16/t/heb_and_greg.t
DateConvert-0.16/t/julian.t
DateConvert-0.16/t/hebrew.t
DateConvert-0.16/t/absolute.t
DateConvert-0.16/Makefile.PL
DateConvert-0.16/INSTALL
DateConvert-0.16/TO-DO
DateConvert-0.16/CHANGES

CPAN.pm: Going to build M/MO/MORTY/DateConvert-0.16.tar.gz

Writing Makefile for Date::Convert
cp Convert.pm blib/lib/Date/Convert.pm
Manifying blib/man3/Date::Convert.3pm
/usr/bin/make  -- OK

cpan> test Date::Convert
Running test for module Date::Convert
Running make for M/MO/MORTY/DateConvert-0.16.tar.gz
Is already unwrapped into directory /root/.cpan/build/DateConvert-0.16
Has already been processed within this session
Running make test
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/absolute........ok
t/heb_and_greg....ok
t/hebrew..........ok
t/identity........ok
t/julian..........ok
All tests successful.
Files=5, Tests=159,  1 wallclock secs ( 0.35 cusr +  0.03 csys =  0.38 CPU)
/usr/bin/make test -- OK

cpan> install Date::Convert
Running install for module Date::Convert
Running make for M/MO/MORTY/DateConvert-0.16.tar.gz
Is already unwrapped into directory /root/.cpan/build/DateConvert-0.16
Has already been processed within this session
Running make test
Prepending /root/.cpan/build/DateConvert-0.16/blib/arch /root/.cpan/build/DateConvert-0.16/blib/lib to PERL5LIB.
PERL_DL_NONLAZY=1 /usr/bin/perl "-MExtUtils::Command::MM" "-e" "test_harness(0, 'blib/lib', 'blib/arch')" t/*.t
t/absolute........ok
t/heb_and_greg....ok
t/hebrew..........ok
t/identity........ok
t/julian..........ok
All tests successful.
Files=5, Tests=159,  1 wallclock secs ( 0.35 cusr +  0.04 csys =  0.39 CPU)
/usr/bin/make test -- OK
Running make install
Installing /usr/lib/perl5/site_perl/5.8.0/Date/Convert.pm
Installing /usr/share/man/man3/Date::Convert.3pm
/usr/bin/make install  -- OK

cpan> exit
Lockfile removed.
999 [root:v0/] /root
# 

Let's walk through this and see what happened.

1. In a terminal window, I ran the command perl -MCPAN -e shell, which started an interactive shell for working with the CPAN site.
2. Next I issued the command make Date::Convert. This started a process to find the module and compile it on my system: it finds the module, downloads it, unpacks it, and then builds it.
3. Now I think I will test the new module before installing it, so I issue the command test Date::Convert. The package is put through a set of tests included in the module to assure you that it works correctly. At the end it reported: All tests successful.
4. Finally, it is time to install the package into my system. The command, as you probably guessed, was just install Date::Convert. The program first checked that the package had been built. Then it reran the package tests to be sure everything was OK. Next it installed the package in the correct location for this distribution. Last but not least, it updated its local information about what packages are installed on the system.
5. OK, I was done, so I typed exit to finish the session.

Now I ask you, was that easy or was that easy? I included this example to show you how far the installation of source code can be automated. In fact, if my first command inside the CPAN shell had simply been install Date::Convert, the entire operation would have been accomplished by a single command.

This same type of automated build process is used in the Gentoo distribution. It installs a core system on your computer and then uses the internet to build the rest just for your machine. So you see, the Linux system has become quite sophisticated in using its own tools to maintain itself.

One additional thing to realize is that this ever-increasing automation is possible because of the tool box nature of the system. The automation was accomplished by incremental changes to the applications; the magic lies in the use of the tools and scripting capabilities inherent in the operating system.

Written by John F. Moore

Last Revised: Mon Jan 16 15:37:34 EST 2017