New2Unix; The Beginner's Introduction to Working with Unix/Linux Systems

by starX

1. Introduction
1.1. What is Unix?!?!
1.2. Getting to the Command Line
1.3. Logging into the System
1.3.1. Avoid the root Account(!!!!!!!!!)

2. Welcome to your New Home (Directory)
2.1. Setting Your Home Directory Permissions
2.1.1. Using the chmod Command
2.2. Making a New Directory with mkdir
2.3. The Directory Tree
2.3.1. Moving around the Directory Tree with cd
2.4. Listing Directory Contents with ls
2.4.1. List Everything in a Directory
2.4.2. Get Detailed Information about a Directory's Contents
2.4.3. Knowing the File Type at a Glance
2.5. Showing your Current Location with pwd
2.6. Removing Directories with rmdir
2.7. Recap

3. Running Programs
3.1. What is a Unix Program?
3.2. The Execution Path
3.3. Running Programs that are in Your $PATH
3.4. Running Programs that are not in Your $PATH
3.5. How do I Find Out What's in My $PATH???
3.6. Adding New Entries to your $PATH

4. Writing and Editing Text
4.1. Editing Files with vi
4.1.1. vi's Modes
4.1.2. Saving your Files in vi
4.1.3. Deleting (Cutting) Text in vi
4.1.3.1. Deleting Complete Lines of Text
4.1.3.2. Deleting Individual Words of Text
4.1.3.3. Deleting by Individual Characters
4.1.4. Copying (Yanking) Text in vi
4.1.4.1. Copying Complete Lines of Text
4.1.4.2. Copying Individual Words of Text
4.1.4.3. Copying Individual Characters
4.1.5. Pasting from the Temporary Buffer into Your Text
4.1.6. Searching Your Document for Text
4.1.6.1. Searching Your Document from the Top Down
4.1.6.2. Searching Your Document from the Bottom Up
4.1.6.3. Navigating in Search Mode
4.1.7. Quitting vi
4.1.8. Recap
4.2. Editing Files With Emacs
4.2.1. Entering Emacs Commands
4.2.2. Saving Files in Emacs
4.2.3. Deleting Text in Emacs
4.2.4. Deleting (Cutting) Blocks of Text
4.2.5. Copying a Block of Text
4.2.6. Pasting Text into Your Document
4.2.7. Searching your Document for Text
4.2.7.1. Searching from the Top Down
4.2.7.2. Searching from the Bottom Up
4.2.7.3. Navigating in Search Mode
4.2.8. Quitting Emacs
4.2.9. Recap

5. Fun with Files
5.1. Viewing a File with more
5.1.1. Navigating more
5.1.2. Searching for Text in more
5.1.3. A Couple more Tricks (Ha ha, get it?)
5.2. Moving a File with mv
5.3. Renaming files
5.4. Copying your Files
5.5. Deleting Files
5.6. Printing Files
5.7. Linking files to One Another
5.7.1. Dissecting a File
5.7.2. Making a Hard Link
5.7.3. Making a Symlink (Softlink)

6. Seek and Ye Shall Find
6.1. Searching the Directory Tree
6.2. Searching your Files for Text
6.3. Whirlwind Introduction to Regular Expressions

7. Network and Internet Utilities
7.1. Checking a Host with ping
7.2. Transferring Files using FTP
7.2.1. The Standard FTP Client
7.2.2. Using ncftp
7.3. Connecting to a Remote Computer with Telnet
7.4. Using the Secure Shell (ssh)
7.5. Web Browsing on the Command Line (?!?)

8. Working with File Compression
8.1. Grouping Files with tar
8.1.1. Creating a tar Archive
8.1.2. Appending to a tar Archive
8.1.3. Viewing the Contents of a tar Archive
8.1.4. Unpacking a tar Archive
8.2. The Compression Factor
8.2.1. Compressing a File with gzip
8.2.2. Decompressing a File with gunzip

9. Wrap-up

10. License

11. References

12. Errata

Copyright (c) 2003 starX. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled "GNU Free Documentation License".

1. Introduction

You're reading this, which means you need help. For one reason or another, you've just been plopped in front of an operating system that you haven't the faintest idea how to use. Very little looks familiar, and you're sure that your professor/employer is out to get you, or that maybe this venture is one of the worst ideas you've had, and maybe it's time to pack the whole thing up and forget it. Am I about right here? Fear not, dear reader. While the task before you seems daunting right now, let me assure you that you too can learn to use Unix and Unix-like operating systems!

1.1. What is Unix?!?!

Unix is an operating system with a long (as far as anything in the computing world goes) and noble pedigree. Born at Bell Labs at the end of the 1960s, Unix is an OS that has stood the test of time, and is still the most popular platform for web servers and developers today. Ever since the release of Linux, a free Unix-like operating system, more and more folks outside of the academic and scientific disciplines have started using Unix (which I will use here to describe Unix and all of its cousins), which does open up a certain problem. Unix is designed to be simple to use rather than simple to learn, and powerful rather than pretty. While your distribution no doubt has a nice fancy graphical user interface attached to it that may remind you of that other, fascist operating system that you probably had to pay for whether or not you wanted it, the real power of a Unix-like operating system can be found on the command line.

1.2. Getting to the Command Line

I heartily recommend that, before you start playing around with anything pretty and graphical, you start using your Unix/Linux machine from the command line. The reason is that just about anything that can be done in a Unix-like operating system can be done most effectively through the command line. As a result, if you really are going to learn Unix, and learn to "think Unix," then you are going to have to learn to think in terms of using your command line interface. As such, this tutorial will be based around the idea that you want to learn how to use the command line, and so that is what I'm going to teach. If you would rather do things from the GUI perspective from the get-go, then now is a good time to find another tutorial. Still with me? Good.

Unfortunately, the wonder that is the command line may be hidden from you. Modern Linux distributions are all too eager to force the boot-to-graphical-mode option on you, and if you're not used to the idea of Unix or Linux, I'm sure that sounded like a wonderful idea at the time. Now the problem exists of finding your way to a command line interface.

Most likely, you will be able to open up a command window right in your GUI. If you go to your task menu and look in the system folder, you should see something called "shell" (note that it may be in another folder depending on your distribution). If you start that program, it should open up a black window on your screen. Congratulations, you've found a command prompt window! You're all set to learn now.

If this process has not been quite so painless, then try pressing your CTRL-ALT-F2 keys simultaneously. This should make the screen flicker a bit, and will display a login prompt. Log in here (enter your user name and password), and congratulations, you've found a command prompt! You're all set to learn now too, but should you need to get back to your windowing environment, press CTRL-ALT-F7; that should do the trick.

One other possible scenario is that you are accessing the Unix machine from another computer (i.e. through telnet or ssh). If that is the case, as soon as you have gotten yourself logged in, you will be at a command prompt, and will likewise be all set to learn some stuff. Shall we proceed then...

1.3. Logging into the System

I've already mentioned this logging in thing once or twice before, but I think it bears a little bit of explanation for those that are truly new to the idea. Unix is a multi-user system, which is a fancy way of saying that it supports a lot of different people using the system in a way that keeps them independent. Everyone with a user account has their own home directory in which they can keep their files, and everyone has the ability to run different programs dependent on their security level. If you don't have a username and a password for the system you're working on, you can go no further.

1.3.1. Avoid the root Account(!!!!!!!!!)

If you haven't installed Unix/Linux on your own computer and have an account on someone else's machine, you can skip this part.

The first account that was created on your system is the root account. Do not use this account, dare I say it... EVER!!! The root account is a very special account that will let you do anything on the system, and if you're not used to working with Unix and thinking in Unix command line thoughts, then this is a very bad thing. It provides you with all of the power to delete every single file on your computer with a few keystrokes, and on a more modest level, will allow you to delete the absolutely essential ones. The only thing you should do with your root account right now is create yourself another user account, one that will only let you screw up non-critical files. If you really botch anything we do here, you can always delete that user account, and create it again from scratch. No mess, hardly any fuss.

Creating a regular user account on your system shouldn't be too difficult. Presuming you have one of the more common Linux distros up and running, useradd username (where username is whatever name you're giving the account) should be sufficient to create the account. To set the password for the account, use passwd username; you will then be prompted to enter a password for the username you have selected. Now log out of your root account, and forget about it for a while. When you can look back at this tutorial and say "Pshaw, how simplistic are the things of which starX writes!" you will be ready to use root. Now log into your system as a mortal user, and get ready to learn how the thing works.

2. Welcome to your New Home (Directory)

Home sweet home directory. This is your personal file space on the system. Here you can create directories, create files, create programs, run those programs (as long as they don't try to do anything to the system), delete it all for spring cleaning, or just hang out and take a look around saying "Yeah... this is all mine."

Right now, presuming you've logged yourself into the system without a problem, you're probably looking at something that looks a lot like this:

bash$

That is your command line interface waiting for input. In a moment, we're going to talk about some of the basic things you can do here, but first I do believe a quick definition is in order.

A directory is the Unix/Linux equivalent of what is called a folder on some other operating systems. It is just a handy way of grouping together a bunch of related files (and other folders/directories) that have something in common. In the case of your home directory the unifying factor is that you own all of the files and directories contained therein. This is crucially important to the way Unix/Linux security works; everything in your home directory is yours, and you control whether people can read from it, write to it, or execute any programs found in there (except for the root account, which can do anything it wants all the time).

2.1. Setting Your Home Directory Permissions

This is one of those first things to understand and do. Each file and directory in Unix and related operating systems has a set of permissions associated with it. These permissions, called a "mode," define what access is allowed for various types of users: yourself, others in your access group, and any other user on the system.

The first and last of these should be fairly self-descriptive, but a note on groups: these can be quite specific or quite arbitrary. Your company may put all individuals working on a specific project into a group. Your university might have all students who signed up for accounts on the system in the same year in the same group. Maybe all non-administrator users are in the same group. The point is that, unless you are expressly told that you need to be giving any kind of access to a certain set of people, it's generally best to think of people in the same group as having the same rights as any other user on the system. If you can trust anyone who may be logged in to read, write, delete, etc. in your home directory, then you may as well trust those in your group to do the same thing. Otherwise, don't trust those in your group any further than you would any other user.

The mode of a directory that defines these permissions is in the form of a three digit number (well.... not quite, but thinking of it as such is adequate for now). Each of the place holders in this number is used to represent yourself, your group, and any other user (in that order). To recap: your permissions are defined in the 100s place, your group members' permissions are defined in the 10s place, and any other user's permissions are defined in the 1s place.

To further complicate matters, permissions are expressed in the form of total numerical values. Read permission is indicated by a 4, write permission by a 2, and execute permission by a 1 (for a file that means running it as a program; for a directory it means being allowed to enter it and reach the files inside). You add up the values of the permissions you want to grant. That means that if a value is set to 6 (4 + 2), read and write permission have been granted, but execute permission has not. A permission value of 5 (4 + 1) means that read and execute permission have been granted, but nothing can be written by the specified class of user.

Putting this all together, you most likely want to be able to do anything you please in your home directory, so you would want to specify yourself as having permission 7 (4 + 2 + 1). In all likelihood, if there really are other users on your system, you don't want them poking around your files, so you would want them (group and others) to have a permission of 0. Thus, the permission mode you would want to set for your home directory is 700.

2.1.1. Using the chmod Command

So now that you know all about what this whole permission mode thing is, and what mode you want to set your home directory to, we need to fire up a command to actually do it. For this purpose, we will use the chmod command, which changes the mode of a file or directory. To properly protect your home directory, issue the following command at the prompt while you are in your home directory: chmod 700 ./

Of course, there may come a time when you want to change a directory's permission in the future. You can issue this command at any time, and eventually, when you have more directories built inside of your home directory, it becomes a good idea to set them all as private every now and again. This can be achieved with the chmod -R 700 ./ command, which will recursively set everything in your home directory and any of its subdirectories (namely all YOUR files and folders) as having full access for you, and no access for anyone else.
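As a minimal sketch of the numbers and the chmod command described above, the following practices on a throwaway directory instead of your real home directory. The mktemp command and the variable names (demo_dir, open_mode, private_mode) are my own choices for the illustration, not something the tutorial requires, and the cut command is only there to trim the ls output down to the permission letters.

```shell
# Create a scratch directory to practice on (mktemp picks the name).
demo_dir=$(mktemp -d)

# 7 = read (4) + write (2) + execute (1) for you;
# 5 = read (4) + execute (1) for your group and for everyone else.
chmod 755 "$demo_dir"
open_mode=$(ls -ld "$demo_dir" | cut -c1-10)
echo "$open_mode"

# 700 keeps full access for you and removes all access for everyone else.
chmod 700 "$demo_dir"
private_mode=$(ls -ld "$demo_dir" | cut -c1-10)
echo "$private_mode"
```

In the letters that ls prints, the leading d marks a directory, and the three groups of rwx that follow belong to you, your group, and everyone else, in the same order as the three digits.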

Need more help? Like I mentioned above, you can use man chmod for the full manual page, and like most Unix commands, issuing chmod --help will print a short list of the options available.

2.2. Making a New Directory with mkdir

If you only had one directory in which to stash your stuff, things would get messy really fast. Well, okay, so they do anyway. In your graphical system, I'm sure you have icons aplenty cluttering up your desktop. But a real problem with the CLI is that if you have too much stuff in a directory then you won't be able to view the entire contents on the screen when you list them. For this reason, use subdirectories, and use them often. Do spring cleaning when you need it. It keeps you organized and helps prevent files from getting lost unnecessarily.

By way of expansion and definition, a subdirectory is any directory that is contained within another directory. These directories are also sometimes referred to as child directories and as leaf directories. Everything on the system, including your home directory, is a subdirectory of the root (/) directory, just as your home directory will be the root directory for all of your files and folders. I'll talk more about this when I get to navigating the directory tree, but for now just remember that your home directory is the starting point for all of your files and folders, and you'll be fine.

To create a new directory, you issue the mkdir directory_name command, where directory_name is the name that you want to give to the new directory. So if you want to create a new directory called "bob" within your home directory, use mkdir bob at the prompt. It is as simple as that.
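Here is a small sketch of mkdir in action. It assumes you would rather practice in a scratch directory than in your real home directory; the mktemp command, the scratch variable, and the [ -d ... ] check are my additions for the illustration.

```shell
# Practice inside a scratch directory instead of your real home directory.
scratch=$(mktemp -d)
cd "$scratch"

# Create the subdirectory, then confirm that it exists and is a directory.
mkdir bob
if [ -d bob ]; then
    echo "bob is a directory"
fi
```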

While mkdir can take a few other options, they probably aren't anything you'll need to concern yourself with unless you're writing shell scripts (a little bit beyond you if you're reading this, but there's another tutorial for that), so I'm going to ignore them here, but check them out using some of the above methods if you're interested.

2.3. The Directory Tree

So now you have a child directory all set and ready to go. This allows us to play around with the oh so important ability to move around the directory tree. Earlier I promised a bit more by way of explanation about all this directory tree and root directory talk, and this is an ideal time. If you don't completely understand it here, try playing around with it a little bit in practice; I think it'll make more sense.

If you imagine the directory structure of a Unix like operating system as a tree, it is a lot easier to understand what's going on. At the very base of it all, you have a root directory, which in Unix parlance is represented by a / (forward slash) character. This is the base directory for the entire system. Every other file and directory is a child of root, and likewise we say that root is ultimately the parent directory for the entire system. Thus, every directory besides the root directory has a parent directory.

The concept of the parent directory is so common that there is a special way of referring to it: .. (two dots). No matter where you are in the directory tree, you can always refer to the parent directory of the directory you are in by using the .. sequence, just as a single dot will always refer to the current directory (see the chmod example above, where ./ stood for the current directory).

Every time we create a new directory, think of it as branching off from the current directory in the way that branches spread out from a tree. Branches spawn off of new branches. Eventually, those branches spawn off leaves. In Unix directory tree parlance, we call a file a leaf node because it can't branch off into something new. Technically, any branch that spawns no children is referred to as a leaf, and therefore an empty directory is also a leaf.

2.3.1. Moving around the Directory Tree with cd

The way to move around the directory tree in Unix is by using the cd (change directory) command. The usage is most simply defined as: cd directory where directory is the directory that we are going to move to. Although remember that move is a completely relative term here; it might be more accurate to say that cd will set the current directory to whatever directory name we specify.

Remember the special ways of describing the parent directory above? The cd command will accept those. Say you issued a cd bob to get to your bob directory, and now you want to get back up to your home directory. Instead of having to enter the full directory path, we can simply type cd .. in its place. Of course the real beauty of this only becomes apparent when you consider that otherwise we would (most likely, the specifics depend on which Unix variant you're using) have to type cd /home/ourusername/. Big deal, you say? Someday, young one, you will be buried deep within your directory tree, will need to move up one or two levels, and will be grateful.

Did you get lost in your system's directory tree? No problem. If you type cd all by itself at the prompt, it will default to your home directory in the absence of any other arguments, and take you back there. Remember, just use cd without any arguments when you get lost.
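The three moves described in this section (down into a directory, back up with .., and home with a bare cd) can be sketched like this. The scratch directory from mktemp and the variable names are assumptions of mine, and the pwd command, which prints where you currently are, is covered in section 2.5.

```shell
# Build a small practice tree: a scratch directory with bob inside it.
scratch=$(mktemp -d)
mkdir "$scratch/bob"

cd "$scratch/bob"     # move down into bob
cd ..                 # move back up to bob's parent directory
parent=$(pwd)
echo "$parent"

cd                    # no arguments: back to your home directory
home_now=$(pwd)
echo "$home_now"
```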

2.4. Listing Directory Contents with ls

Of course it also helps if you are able to view the contents of your directories. This is achieved by using the ls command. The basic syntax goes a little like this: ls directory where directory is the directory that we want to view. If you don't include a directory name, then ls will list the contents of the current directory. Running this command will output the contents of the directory to your screen (or as much of it as will fit).

The ls command has a hella lot of options, some of which will likely be more useful to you than others. I'm not going to discuss all of these options here, but I will show you how to use (in my experience) some of the more useful ones.

2.4.1. List Everything in a Directory

A casual use of the ls command will very rarely turn up everything that you have stashed away in a directory (especially your home directory). You may have any number of hidden files and directories (their names begin with a dot, and they are usually used for configuration and customization purposes by various programs). Generally you'll want to ignore these and leave them alone, but it may be handy to view all of the contents in case one of these files isn't found by a program, and you need to verify that it actually isn't there, rather than being corrupted, or just plain misspelled.

To list the complete contents of a directory use: ls -a directory
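To see the difference for yourself, here is a minimal sketch. The file names are invented for the example, and touch (a command that creates an empty file) and mktemp are not covered in this tutorial; I am only using them to have something to list.

```shell
# Set up a scratch directory with one ordinary file and one hidden file.
scratch=$(mktemp -d)
touch "$scratch/visible.txt" "$scratch/.hidden_config"

# Plain ls skips names that begin with a dot.
plain=$(ls "$scratch")
echo "$plain"

# ls -a shows them, along with . (this directory) and .. (its parent).
everything=$(ls -a "$scratch")
echo "$everything"
```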

2.4.2. Get Detailed Information about a Directory's Contents

A lot of the time you're going to want to know more about a file than whether or not it is simply there.

To view detailed information about the contents of a directory: ls -l directory

This will produce a multi-columned display of information about each item in the directory you've specified, which I will explain below.
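As a rough guide to reading that display, the sketch below lists a single practice file and names the columns in a comment. The file name and the scratch directory are assumptions made for the example, and the exact spacing and date format vary from system to system.

```shell
# Make a small file to look at.
scratch=$(mktemp -d)
echo "hello" > "$scratch/notes.txt"

# Each line of ls -l output has these columns, left to right:
#   mode (file type and permissions), number of links, owner, group,
#   size in bytes, time of last modification, and the name.
listing=$(ls -l "$scratch/notes.txt")
echo "$listing"
```

The first character of the mode column tells you the type: a dash for a regular file, a d for a directory. The nine letters after it are the same read/write/execute permissions discussed in the chmod section.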

2.4.3. Knowing the File Type at a Glance

One of the more confusing things (initially) about Unix and its cousins is that everything in the directory tree is a file, whether it's a directory, a human readable file (like a text file), an executable file, or a link to another file (more on these later). Sometimes it helps to see at a glance what kind of file you're looking at. Using the ls -F directory command, you will get a character appended to the end of each file name that indicates its type.

There are a few other types of file in the system, and hence there are a few other types of characters you might see appended, but chances are you won't come across them for some time, and when you do, you'll know what they are and what they do. If you do run across something that is unfamiliar, it's probably best to leave it alone for now... unless you like running your system installer that is.
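The most common markers can be seen with a quick experiment. This is only a sketch: the file names are made up, and touch (creates an empty file) and ln -s (creates a symbolic link, covered later in the section on linking files) are used here just to have one of each type to look at.

```shell
# Build one of each common file type in a scratch directory.
scratch=$(mktemp -d)
mkdir "$scratch/a_directory"
touch "$scratch/a_text_file"
touch "$scratch/a_program"
chmod +x "$scratch/a_program"
ln -s a_text_file "$scratch/a_link"

# / marks a directory, * an executable file, @ a symbolic link;
# ordinary files get no marker at all.
marked=$(ls -F "$scratch")
echo "$marked"
```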

2.5. Showing your Current Location with pwd

If you ever get lost in your directory tree and actually need to know where you are (rather than just getting back to your home dir), you can use the pwd (print working directory) command to display the complete path of the directory that you are currently working in. By path I mean the list of parent directories, starting with the root directory on the left, and going through a list of all of the child directories (of root) that it takes to reach your current location.

The syntax for the pwd command is very straightforward; just type pwd at the prompt.
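A short sketch of what that looks like, using a scratch directory (created with mktemp, which is my addition) so the path printed is harmless:

```shell
# Move into a practice directory a couple of levels down.
scratch=$(mktemp -d)
mkdir "$scratch/bob"
cd "$scratch/bob"

# Prints the full path from the root directory (/) down to where you are now.
where=$(pwd)
echo "$where"
```

The path you see will start with a / for the root directory and end with bob; the directories in between depend on where your system keeps temporary files.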

2.6. Removing Directories with rmdir

Consider this the first part of the section on deleting things (the second part will follow when we actually have some files in our directory to work with). So far I've only shown you how to make a directory, and aside from any default config files in your home directory, you shouldn't have any files kicking around in any of your directories. If you should ever feel the need to delete one of these directories, there is a command that is ideal for the situation: rmdir.

The syntax for rmdir is rmdir directory where directory is the name of an empty directory that we are going to delete from the tree. Note that if you have anything at all in the directory, rmdir will not function. It requires you to have deleted everything within the directory yourself before you are able to delete the directory, which provides a safeguard against accidentally blowing away years of work.
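The safeguard is easy to see in a scratch directory. In this sketch the file name, the first_try variable, and the use of touch and mktemp are my own assumptions for the demonstration; rm, which deletes a file, is covered later in the section on deleting files.

```shell
# Make a directory and put one file in it.
scratch=$(mktemp -d)
mkdir "$scratch/bob"
touch "$scratch/bob/report.txt"

# bob is not empty, so rmdir refuses and bob survives.
if rmdir "$scratch/bob" 2>/dev/null; then
    first_try="removed"
else
    first_try="refused"
fi
echo "$first_try"

# Empty the directory yourself, then try again; this time it works.
rm "$scratch/bob/report.txt"
rmdir "$scratch/bob"
```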

2.7. Recap

For now those are the operations that it probably behooves you the most to know for working with and managing directories. Check out their man pages and help options for more info, and try to remember the golden rule: when in doubt, don't touch it. Seek the wisdom of a higher power (see the Getting Help section below).

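To review the commands that we've discussed in this section:

chmod mode directory: set the permission mode of a directory (add -R to include everything inside it)
mkdir directory_name: create a new directory
cd directory: change the current directory (cd .. moves up one level; cd alone takes you home)
ls directory: list a directory's contents (-a shows everything, -l shows details, -F marks file types)
pwd: print the full path of the current directory
rmdir directory: remove an empty directory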

3. Running Programs

The flesh and bone of an operating system are the programs that it runs, and Unix is no exception to this rule. So far, we've been running programs that print the directory contents to the screen, programs that change what directory we're in, programs that make new directories, etc. But we haven't really explored what goes into running a program much beyond something as simplistic as "type X at the command line." That changes here.

While in many cases you will be able to run a program by simply typing its name on the command line, sometimes you will need some variation of this. Here I will try to describe not only what makes a file a program, but also how the way you run a program may depend on your location in the system and on what some of your configurations look like.

3.1. What is a Unix Program?

At its simplest definition, a Unix program is a file that has been given executable permissions. Remember that what files can and can't do in Unix is completely dependent upon the permissions that the file has been granted (see chmod). Once an executable flag has been set on a file (chmod +x filename), then there is nothing stopping you from running it as if it were a program. Of course if the file in question just happens to be an email to your mom, then Unix won't have much of a notion as to what to do with the program it's running, and if the gods are smiling on you, it will return some sort of error and pretend the whole nasty incident never happened. Will it let you try again? Yup.... so if you really want to push your luck....

By the same token, it's not entirely fair to say that a program has to be compiled binary (machine readable) content. While it is most often the case that the programs you run are compiled binaries, some of these programs are special types of programs called script files, which are slightly more human readable, and run another program, called an interpreter, that executes the script. But now we're splitting hairs.
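To make the idea of "a file with the executable flag set" concrete, here is a minimal sketch that turns a two-line text file into a runnable script. The file name hello, the scratch directory, and the cat ... <<'EOF' trick for writing the file are my own choices for the example; the first line of the script (#!/bin/sh) is how a script names the interpreter that should run it.

```shell
scratch=$(mktemp -d)

# A tiny script: the first line names the interpreter that will run the rest.
cat > "$scratch/hello" <<'EOF'
#!/bin/sh
echo "hello from a script"
EOF

# Until the executable flag is set, the file is just text.
# Once it is set, the file can be run as a program.
chmod +x "$scratch/hello"
greeting=$("$scratch/hello")
echo "$greeting"
```

Note that the script is run by typing its full path, which is exactly the situation described in the section on running programs that are not in your $PATH.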

For now, it's probably safe to say that you're not writing any of your own programs quite yet, so just remember the Golden Rule (when in doubt, don't touch it), and you should be fine.

3.2. The Execution Path

Execution Path is a big term that has a fairly simple meaning. In Unix terms, a path is the list of directories that one must go through to get to a certain location (like /home/your_user_name is the path to your home directory). The Execution Path is the location of an executable program. If the ls program is located in your /bin directory, then its execution path is /bin/ls. Simple? Of course it would be a nuisance to have to type /bin/ls each time we wanted to list a directory, and so the system maintains a variable that contains a list of directories in which to search for programs we are allowed to run. The directories are searched in order: if the program isn't found in one of them, the search moves on to the next, and if it isn't in any of them you get a "command not found" error. Typically, a $PATH (that's the name of the variable that contains the list) is not extremely large, and so if there is any noticeable delay in finding the right program, you're most likely going to see it as standard execution time for the program you're running. Unix is tricky that way.

So why do we care about the execution path? Mostly we don't, at least not until we start installing new software, which can be in and of itself a cumbersome task. However, since we are talking about running programs, I need to discuss the execution path because, at this point, it is efficient to divide programs into two categories. Those that are in your $PATH, and those that are not.

3.3. Running Programs that are in Your $PATH

Type its name on the command line. Yep, it's that simple. When we run a program, we must specify the complete execution path to that program, but if the program is in one of the directories that is in the $PATH variable, then all we need do is type its name on the command line.

3.4. Running Programs that are not in Your $PATH

Type the full execution path on the command line. For example, pretend we live in a twisted parallel universe where /bin is not in our $PATH variable. To execute the ls program (and indeed many other useful ones), we would need to type /bin/ls on the command line each time we wanted to run it.
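The two ways of running a program can be compared side by side. This sketch assumes, as the example above does, that ls lives at /bin/ls on your system (true on most systems, but not guaranteed); the scratch directory and file name are invented for the demonstration.

```shell
# Something to list.
scratch=$(mktemp -d)
touch "$scratch/example.txt"

# Typing the bare name relies on the shell searching $PATH for ls.
by_name=$(ls "$scratch")

# Typing the full execution path skips that search entirely.
by_full_path=$(/bin/ls "$scratch")

echo "$by_name"
echo "$by_full_path"
```

Both commands run the same program, so both print the same listing.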

3.5. How do I Find Out What's in My $PATH???

Honestly, this isn't something you need to worry about right now. If something isn't in your $PATH, you probably don't need to worry about it, but for the curious:

echo $PATH will print the contents of the $PATH variable on the screen. You are running the echo program, which simply prints back whatever arguments it is given; the shell swaps in the contents of $PATH before echo runs, so what gets printed is the complete list of directories that make up your default execution path.
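The output is a single line with the directories separated by colons, which can be hard to read. As a small sketch, the second command below prints one directory per line; it uses tr, a program not otherwise covered in this tutorial, to turn each colon into a line break, and path_lines is just a name I picked.

```shell
# Print the variable exactly as the shell stores it: directories separated by colons.
echo "$PATH"

# One directory per line is easier to read; tr swaps each colon for a newline.
path_lines=$(echo "$PATH" | tr ':' '\n')
echo "$path_lines"
```

The directories are searched in the order shown, top to bottom.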

3.6. Adding New Entries to your $PATH

Again, not necessarily a good idea. Before I say anything else about this, allow me to first address the enterprising among you who have figured out that you can put the current directory symbol ("./") in your $PATH. Don't do it. Putting the current directory symbol in your $PATH is a huge security risk, and if you should ever happen to have a file name that matches the name of some program you want to run, you may be unable to do so.

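If you are bold and daring, and want to add something to the $PATH variable: PATH=$PATH:/new/execution/path where /new/execution/path is the full path to the directory that you want to add. The $PATH: in front matters; without it you would replace the whole list with your one directory instead of adding to it, and commands like ls would stop being found. Again, this is something that I highly suggest you don't do for now.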
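If you want to see what appending a directory does without committing to it, here is a cautious sketch. It makes the change inside $( ... ), which runs in a separate subshell, so your real $PATH is left untouched; /new/execution/path is the same placeholder used above, and new_path is a name of my own.

```shell
# Append a directory to the existing list, but only inside the subshell
# created by $( ... ), so the change is thrown away afterwards.
new_path=$(
    PATH="$PATH:/new/execution/path"
    echo "$PATH"
)

# The modified list ends with the new directory; everything before it is unchanged.
echo "$new_path"
```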

4. Writing and Editing Text

Plain text files are the flavor of choice for source code for compilation as well as configuration files, and so they are definitely something you are going to need to deal with. As such, I am now going to give you a very brief introduction that will describe how you can create, open, change, and save text files in Unix. Most Unix text editors are very old and robust, and can do just about anything you could possibly imagine. Those functions are a bit beyond the scope of an introductory tutorial, however, and if you really are interested in learning more, check out the reference section below, or the man page for the editor of your choosing.

That being said, Unix has a lot of different text editors, and many text editors have different versions that are aimed at special purpose use (such as small size, more features, programming language specific, etc.). This being the case, there is no way that I dare to dream of covering ALL of them here. Instead, I am going to stick to two of the most common editors found in a Unix environment: Emacs and vi.

4.1. Editing Files with vi

vi (vee - eye) is one of the oldest text editors around on Unix. It is, in its most basic form, smaller and quicker than Emacs, but the trade off is that it has less advanced functionality. Still, it makes a very good choice; if you find yourself in a rescue disk situation, chances are that you're not going to have Emacs available, and if you're not familiar with how vi works, you'll find yourself needing to learn a new editor as well as having to fix your system quickly. That being said, vi is quite pluggable, and if you really do need it to do something that it does not already do, chances are that someone has written a plug-in to do it. Also, because vi is built upon the ex editor (an editor designed for editing files without human interaction), it is also scriptable, and so later on in life you will be able to "program" with vi to get it to do cool things for you (although Emacs also has this capability).

To start editing a file in vi, type vi filename, where filename is the name of the file that you want to edit. If the file already exists, then vi will display the contents of the file for you in the editing buffer. If the file does not exist, vi will create an empty buffer for you, but the file will not actually be created in your directory tree until you save it.

4.1.1. vi's Modes

Now you've got the file open, and this is where it gets a little bit tricky. The vi editor has a few different modes that it can be in (not to be confused with the modes that determine file permissions in the directory tree). These modes have different commands available, and if you're not in the correct mode, you won't be able to do certain things. Let me briefly introduce you to some of the modes available in vi, and what (in general) you can do in each.

Command Mode is the default mode for vi; it is the mode that vi starts in. From command mode, you will be able to enter commands that allow you to copy, cut, search, and paste text, as well as the commands that will allow you to save your work. If you're not sure what mode you are in, you can press the "ESC" key to return to command mode. Hitting the escape key in command mode will generally not produce any untoward effects, so it is most likely a safe thing to do.

Insert Mode is the mode that you will need to enter in order to write text to your file. While in insert mode, you can generally only enter text at the current cursor position; in classic vi you will not even be able to move the cursor around (although some of the newer versions of vi will allow you to do this). There are two common ways of entering insert mode: if you press the "a" key in command mode, you will begin adding (appending) text immediately after the character the cursor is on. If you press the "i" key, you will begin inserting text immediately before the character the cursor is on.

Okay, so now that you know about the different modes, it's time I started showing you how this editor works. First, we'll need some text to edit. Go into insert mode (see above) and type in some text (any old text will do), and then we'll start with some of the editing process.

4.1.2. Saving your Files in vi

Now that we have some text in the buffer, the first thing we'll want to do is to save it. To save your file, enter command mode (press ESC), type :w and press the "ENTER" key. By way of explanation, the colon (:) character tells vi that what follows is a command for the editor itself rather than something to do to the text. That is not to say that all commands will begin with a colon, but anything that requires vi system interaction (saving, quitting, and the like) will.

If you started vi with an anonymous buffer (you didn't specify a file name when you started vi), or you want to change the name of the file you've opened, then instead you would type :w filename where filename is the name of the file you want the buffer to be saved as.

4.1.3. Deleting (Cutting) Text in vi

In traditional vi, the backspace key will only delete text that you have typed since you last entered insert mode. Once you return to command mode, you will need to use the vi commands for deleting text (although newer versions of vi may allow you to delete older text using the backspace key).

Fundamentally, deleting and cutting text is the same thing when using the vi delete command. vi will remove the text you have chosen to delete, and will keep it in a temporary buffer, where it will remain until you delete or copy more text, or until you exit vi. While the text is in the temporary buffer, you will be able to paste it back into your documents. However once the text is gone from the temporary buffer, you will have to re-enter it by hand if you want to put it back into your document.

4.1.3.1. Deleting Complete Lines of Text

This is accomplished by positioning the cursor on the line of text you wish removed from your document and pressing dd while in command mode. You can delete multiple lines by typing ndd where n is the number of lines you wish deleted. In this case, it will delete the line you currently have the cursor positioned on and n-1 lines below that. This technique is most useful for deleting large blocks of text.

4.1.3.2. Deleting Individual Words of Text

You can delete words of text by typing dw in command mode. This will delete from the cursor's position to the start of the next word, so put the cursor on the first letter if you want to remove a whole word along with the whitespace that follows it. If you have the cursor positioned on a whitespace character, then the whitespace will be deleted up to the next printing character in the document. You can delete multiple words by typing ndw in command mode, where n is the number of words you want deleted. Depending on your version of vi, blocks of whitespace characters may or may not be counted as words when using this method, so be sure to experiment.

4.1.3.3. Deleting by Individual Characters

It is also possible to delete items from your document by individual characters. This can be accomplished by typing dl while in command mode. The character that the cursor is currently positioned on will be deleted from your document. Not surprisingly, you can delete multiple characters by applying a similar formula as the other methods of deleting text described above. Typing ndl in command mode will delete n characters from your document, starting with the one that the cursor is currently positioned on.

4.1.4. Copying (Yanking) Text in vi

It is often advantageous to copy text from a document without deleting it from its original position. In vi, this operation is called "yanking," but it is not especially different from the copy operations you are already familiar with. It is important to note, however, that yanked text goes to the same temporary buffer as deleted text. If you have cut something for later use and then copy some other text, the text that you cut will be gone, so be careful when doing this. (More recent versions of vi may keep more than one buffer around, so either experiment or check the documentation that came with your version.)

4.1.4.1. Copying Complete Lines of Text

You can copy an entire line of text from your document by typing yy while in command mode. Similar to the method used for deleting entire lines noted above, if you type nyy in command mode, where n is the number of lines (including the line the cursor is currently on) you want copied, the line the cursor is positioned on and n-1 lines below that will be copied into the temporary buffer.

4.1.4.2. Copying Individual Words of Text

To copy from the cursor's current position to the end of the word, type yw while in command mode. If you want to copy more than one word, type nyw where n is the number of words (including the one the cursor is currently positioned on) that you want copied into the buffer.

4.1.4.3. Copying Individual Characters

You are likewise able to copy an individual character by typing yl while in command mode. To copy multiple characters, type nyl while in command mode. This will copy n characters, starting with the character that the cursor is currently positioned on.

4.1.5. Pasting from the Temporary Buffer into Your Text

In order to paste something into your text, you will need to have first gotten something into the temp buffer by either deleting it or copying it in there. See the sections above if you're not sure how to do this, and copy or cut some text into your temp buffer to test this.

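To paste text that has been cut or copied from your document into the temporary buffer, position the cursor at the point where you want to place the text, and press p while in command mode. The text goes in just after the cursor, or on a new line below the current one if you cut or copied whole lines; a capital P puts it before the cursor (or above the current line) instead. Yes, it really is that simple.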

4.1.6. Searching Your Document for Text

It often happens that you need to search a document to find a piece of text. If, for example, you've discovered you've been misspelling the name of an author when you're writing a paper, the best way to find each instance of the misspelled name is to use your editor to find each occurrence of the pattern. That way you can be sure you've got them all. There are a couple of ways to search a document for a specific pattern of text, which I will now discuss.

4.1.6.1. Searching Your Document from the Top Down

To search from your current position to the end of the document, type /pattern while in command mode, where pattern is the pattern of text that you want to search for. This will cause vi to start searching your document from the current position toward the end for pattern. vi will stop when it has found the first occurrence of the pattern that you were searching for.

4.1.6.2. Searching Your Document from the Bottom Up

To search the document from the current position to the beginning of your document, type ?pattern in command mode, where pattern is the pattern of text you want to search for. This will cause vi to start at the current cursor position and begin to look backward for the pattern. vi will stop at the first occurrence of the pattern that it finds in the text.

It bears mentioning that the pattern that vi finds need not be a complete word. For example, if you search for each instance of the word "you" (/you), vi will stop at words such as "you're", "your", "you'll", etc.

4.1.6.3. Navigating in Search Mode

Once you have typed your pattern, press the "ENTER" key to run the search; vi will jump to the first match and leave you in command mode. If that is the text you were looking for, you can carry on editing. If, however, you would like to keep searching, you can either press the "n" key to continue searching in the same direction in which you were already searching, or you can press "N" (a capital n) to search for the pattern in the opposite direction. Don't reach for the "p" key here; as you saw above, that one pastes.

4.1.7. Quitting vi

To end your current vi session, type :q while in command mode. Depending on your version of vi, it may warn you if you have not saved. It is possible to issue a "save and quit" command by combining the two commands into one: :wq in command mode; this option is usually a good idea.

Just remember that any unsaved buffers, your temporary cut and copy buffer included, will be lost when you quit vi.

4.1.8. Recap

4.2. Editing Files With Emacs

Emacs is, in many ways, what vi is not. While most versions of vi tend to start in a "bare bones" mode, Emacs generally starts by loading up all of its extensions by default. Thus, you generally have a lot more at your disposal when you first start out than you do when starting a vi session. However, when it comes to the sort of basic text editing that I am showing you how to do here, you will find little (if any) difference between using vi and using Emacs.

Emacs is probably going to be a little bit more familiar to those of you who grew up using WYSIWYG word processors. When it starts, it starts in a traditional editing mode that lets you start entering text right away. Entering various Emacs commands requires some fancy key pressing, but not a hell of a lot, and much like the initial oddities of vi's mode swapping, remembering which commands require what keys is something that is easily learned.

To start editing a file with Emacs, type emacs filename at the command line, where filename is the name of the file you want to edit. If you omit a file name, Emacs will start in a temporary buffer, and you can create your files from there.

4.2.1. Entering Emacs Commands

As I have already said, Emacs starts out in its primary editing mode: when you start Emacs, you can just start typing text into your file. If you want to give Emacs a command of some sort, you will need to use command and meta keys to access the command buffer (the place where you issue commands) before you can enter the command you want to use. The command key that Emacs uses is, by default, your control (CTRL) key on the keyboard (abbreviated hereafter to C-). The meta key is, by default, the escape key (ESC) on the keyboard (abbreviated hereafter to M-); on most modern keyboards the ALT key will work as the meta key as well.

When you issue a command, you hold down the required command key (or ALT), and then press the key that is associated with the command that you want to use. If you are using ESC as your meta key, press and release it before pressing the next key rather than holding it down. Sometimes a command will require pressing more than one key, which may or may not require the command/meta key being pressed at the same time.

Confused? I'll admit, this one is a bit confusing at first; perhaps more so than vi's method of entering commands. Once you've memorized what commands require which key combinations, however, you will find that Emacs does offer several options that vi does not.

Hmm... maybe an example is in order: take the command for saving your file: C-x s. This means that, while holding down the CTRL key, press the "x" key. Then release the CTRL key and press the "s" key. This is why command/meta keys are displayed as C- and M- respectively; the dash indicates that you need to press the command/meta key while pressing another key. Got it? Gosh I sure hope so, because it's time to look at some of those Emacs commands...

4.2.2. Saving Files in Emacs

It's great to be able to save your work, no? To save the file that you're currently working on, type C-x s. You will then be asked whether or not you want to save. Press the appropriate key to suit your mood ("y" to save).

It is also possible to simply save your file without first being prompted. To do this, type C-x C-s.

If you started Emacs without specifying a filename, Emacs will prompt you to enter one. It's just user friendly like that.

4.2.3. Deleting Text in Emacs

This is vastly different than in vi, and as such, I'm going to present this separately from the topic of cutting text. Since Emacs is, to a certain extent, a WYSIWYG editor, it is generally enough to position your cursor at the appropriate point in the text, and press the BACKSPACE key until the text you want deleted is gone. Unlike in more traditional vi versions, you can always do this, whether you have just saved the text or not. But just like in a WYSIWYG editor, the text is deleted, not cut.

4.2.4. Deleting (Cutting) Blocks of Text

Again, this is where Emacs shows some of its WYSIWYG nature. Unlike vi, when you want to cut blocks of text from your file, you do not need to concern yourself with doing it by a number of lines, words, or characters: you define the block of text to be cut from the document. As you've probably guessed, this leads to incredible precision in what gets cut from the text. However, before you can do any cutting, you first need to define the beginning of the region that you want to cut.

To define the beginning of the region for cutting, position your cursor at the appropriate point in the text and press C-SPACEBAR. This will set a mark to that particular point in the document. It is worth noting that the mark will be set there until you close the document, or until you set it somewhere else within the document.

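Now that the mark is set, you will need to move your cursor to the end (or beginning, if you're going backward) of the region that you want to cut. Once there, press C-w to cut everything between that point and the mark. It is worth noting that Emacs treats the cursor as sitting just before the character it is displayed on, so when the cursor is at the end of the region, the character under the cursor is not cut; move one character further if you want it included.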

As I hinted, the direction that you cut does not matter. You can place the mark at the bottom of your file, and position your cursor at the top, and the procedure will work precisely the same as if the mark were at the top and the cursor at the bottom.

As you have probably guessed, text that is cut in this way can then be pasted back into the document. If you cut or copy something else in the meantime, the newer text is what gets pasted by default. Emacs does keep the older text around in what it calls the kill ring, and pressing M-y immediately after a paste will swap in the previous entry.

4.2.5. Copying a Block of Text

Copying a block of text in Emacs works in much the same way as cutting the block of text. To begin with, you must define the beginning of the region that you want to copy (See Deleting (Cutting) Blocks of Text for more info).

Once you have set the mark on the outer boundary of the region of text that you want to copy, simply position the cursor at the opposite boundary of the region (remember, the region stops just before the character under the cursor), and press M-w. This will copy the text into a temporary buffer without actually deleting it from your document. Just remember that the newly copied text is now what will be pasted, in place of whatever you cut or copied before.

4.2.6. Pasting Text into Your Document

This is about as easy as it gets. Presuming that you have something in the temp buffer (i.e., you have already cut or copied something there), simply position the cursor at the point in your text where you want the cut/copied text to go, and press C-y.

4.2.7. Searching your Document for Text

Emacs gives you the ability to search for a pattern of text in your document just as vi does. You can either search toward the top or the bottom of your document, but remember that anything matching the string you search for will be matched regardless of context; searching for "the" will match "the", "there", "theater", "them", etc.

4.2.7.1. Searching from the Top Down

To begin searching your document from your current position toward the end of your document, use M-x search forward and press the enter key. You will then be prompted to enter the text you want to search for.

Note that in the above code you can press the spacebar, but in the command buffer a dash (-) will be displayed. This is simply a naming convention of Emacs extensions, and a convenience in your settings. You can type a dash instead of a space, but it often feels more comfortable to use a space.

4.2.7.2. Searching from the Bottom Up

To search from your current position toward the top of the document, use M-x search backward and press the enter key. You will be prompted to type in the text you want to search for. After you type that in and press enter, Emacs will find it for you.

As with the forward search, you can type either a space or a dash between the two words of the command name; Emacs will display a dash in the command buffer either way.

4.2.7.3. Navigating in Search Mode

Fooled you. There really isn't a "search mode" per se in Emacs. You run a search command, which will find the first instance of the search text in whatever direction you have specified. Once it has found that instance, it will stop until run again. So if you didn't quite find what you were looking for the first time around, run the command again.

4.2.8. Quitting Emacs

While you might want to save first, when the time comes that you have finished editing whatever it was that you were editing in Emacs, use C-x C-c to quit.

4.2.9. Recap

5. Fun with Files

By now I'm sure you're having all sorts of fun using Emacs and vi to create files for yourself, but the question remains, what exactly can I do with these files? Oh sure you need them for storing your data, but every now and again, you'll want to open those files for read only, move them to a different directory, rename them, delete them, maybe even make a shortcut to a file in a different directory. Now that you actually know how to create some files, it's time to take a gander at some of the basic file utilities that Unix and its cousins have to offer. Some of this is pretty standard fare, but a few things will be new to you if you're new to Unix.

5.1. Viewing a File with more

Often times it is desirable to view a file without actually opening it in a text editor. Even if you have marked the file as read only (or have set your text editor to do so when you start it), starting a text editor can carry with it overhead that can make opening a file for the simple purpose of reading it a very cumbersome task if you have an older (read: slower) machine, or if you are connected to the Unix machine over a slow connection. Fortunately there is a lightweight alternative that is almost ubiquitous on Unix systems: the more command.

More is a program that allows you to open a file for reading, navigate that file, and even search that file for specific lines. As with many Unix programs, it only does this, and so it is very good at it, is fairly simple and straightforward to use, and has a small memory footprint (which means that it won't be too cumbersome to use on an older computer or a slower internet connection. A large file on the other hand.....).

The syntax for more is more filename; where filename is the name of the file that you want to open. You are allowed to pass a few command line options to more, but I'm not going to address those here. See the man page if you're really interested in those.

Once you start more, you will be looking at the beginning of the file you have opened. At that point you have a few options.....

5.1.1. Navigating more

After you have started more, you can move forward in your file one line at a time by pressing the "ENTER" key. Alternatively, you can move forward one page at a time by pressing the SPACEBAR.

You can also move backward after you have advanced a bit through your file. Press the "b" key to go back one page.

5.1.2. Searching for Text in more

Awful thoughtful of Shienbrood, Peck, and Foderaro (the nice fellows who wrote more) to include the ability to search for a pattern of text in more. It increases the usefulness of this program by oodles. If you want to search for a pattern of text in more, just remember that you are searching for a pattern, and so "the" will match with "there", "theater", "them", etc.

To begin a search for a pattern of text in more, type /pattern (with no space after the slash) and press the "ENTER" key, where pattern is the text pattern that you are looking for. If you are not satisfied with the first occurrence of the pattern and want to move on to the next, press n.

5.1.3. A Couple more Tricks (Ha ha, get it?)

You can display the current line number in more by pressing =.

If you find a problem with the file and need to change it right away, you can press v to start a vi editing session at your current position in the file.

If you forget everything else, remember that you can press h to get a list of some of your navigation and command options.

5.2. Moving a File with mv

Eventually it will happen that you want to do a little bit of spring cleaning in your home directory. You'll make some sub directories, and want to file away your work. Or maybe you want to archive your emails to another location. The mv command to the rescue!

The syntax for mv is mv source_file destination_file where source_file is the name of the file that you want to move, and destination_file is the name of the file that you want to move it to. There are a number of command line options available, but you'll have to look those up on your own (I bet you know where, too).

Source file is pretty self-explanatory, but destination file may bear some explanation. This simply means the path to the place you want to move the file to. If you only specify the name of an existing directory, then the file name is implied (the name it has in the current directory): "mv myfile mydirectory/myfile" and "mv myfile mydirectory" are equivalent.

It is probably also worth noting that the source file no longer exists under its old name once the move is done. Be careful about the destination, too: on most systems mv will quietly over-write an existing file with the same name, so you may want to try it out on an empty file before you start using the command liberally, or use mv -i, which asks before over-writing anything.
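
Here is a small sketch of the move described above, run in a scratch directory so that nothing of yours gets shuffled around by accident. The names myfile and mydirectory come from the example in the text; the scratch directory made with mktemp is just my assumption for illustration.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Set up a file to move and a directory to move it into.
echo "some notes" > myfile
mkdir mydirectory

# Naming only the directory keeps the file's current name.
mv myfile mydirectory

# The file now lives under mydirectory and is gone from where it started.
ls mydirectory
```

After this runs, ls shows myfile inside mydirectory, and there is no longer a myfile in the directory you started from.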

5.3. Renaming files

The shrewd among you will have noticed that in mv I called it the "destination file". So what if I want to move the file to a directory and give it a different name while I'm at it, you ask? It's as easy as you thought: "mv myfile mydirectory/thisfile" will move the file myfile from the current directory, and place it in the "mydirectory" directory with the name "thisfile", effectively changing its name.

The same applies to renaming a file in the current directory; just specify the destination file as a different file name in the same directory as the original file, i.e. "mv myfile thisfile" will "move" the file "myfile" to the file "thisfile" in the same directory, without changing the file's contents; effectively you rename your files by moving them to a different name.
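
To see both kinds of renaming in one go, here is a rough sketch. myfile, thisfile, and mydirectory are the names used above; the name archived and the scratch directory are made up by me for the example.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "same contents" > myfile

# Rename in place: "move" myfile onto a new name in the same directory.
mv myfile thisfile

# Move and rename in one step (the directory has to exist first).
mkdir mydirectory
mv thisfile mydirectory/archived

# The contents came along for the ride, untouched.
cat mydirectory/archived
```

Only the name (and location) changes at each step; the text inside the file is the same at the end as it was at the start.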

5.4. Copying your Files

In the event that you ever want to make a copy of a file to another directory, or you want to copy the contents of one file into a different file name without changing the original, Unix provides us with the cp command. The syntax is: cp source_file dest_file where source_file is the original, and dest_file is the copy. If you are copying the file to a different directory, dest_file must include the path to that directory.

Remember, using cp, the original source file will be unaltered, making this command great for making a backup copy when you are working on a project that may require you to undo certain changes.
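
The backup idea might look something like the following sketch. The file names report.txt and report.txt.bak are my own inventions, and everything happens in a scratch directory.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "draft one" > report.txt

# Make a backup copy before doing anything risky.
cp report.txt report.txt.bak

# Change the original; the backup is not affected.
echo "draft two" > report.txt

# Changed your mind? Copy the backup over the original.
cp report.txt.bak report.txt
cat report.txt
```

Note that the last cp over-writes report.txt without asking, which is exactly what we want here but is worth keeping in mind elsewhere.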

5.5. Deleting Files

Eventually the time will come when you just want to throw old things away. When data is out of date, and you're not concerned with archives, it's time to delete those files; in Unix parlance we remove them from the directory tree. To do this, we use the rm command; the syntax is rm filename. It may surprise you to learn that filename is the name of the file that we're deleting from the directory tree, but I sure hope not ;)

Be advised that, barring paying someone a few thousand dollars to go over the contents of your hard drive with an electron microscope, using rm to delete something is permanent. Once you have removed something from your directory tree in this way, there is nothing practical that you can do to get it back. Thus, you should use this command with extreme caution.
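
A quick sketch of what that looks like in practice, using a throwaway file (oldfile is a made-up name) in a scratch directory:

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "out of date" > oldfile

# Remove the file from the directory tree.
rm oldfile

# There is no trash can to fish it back out of.
if [ -e oldfile ]; then
    echo "oldfile is still here"
else
    echo "oldfile is gone"
fi
```

If you would like a safety net while you are getting used to the command, rm -i asks for confirmation before it removes each file.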

It is probably also worth noting that this method is not a secure way of removing data. If you are dealing with sensitive material that must never fall into the wrong hands or your life/liberty/career would be in serious jeopardy, it is highly recommended that you over-write any data files with junk data (just keep mashin' that keyboard) before deleting them. Then again, if you're the sort of person who deals with such sensitive data, you probably already knew that.
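
If mashing the keyboard is not your style, the over-writing can be sketched with dd. Treat this as an illustration of the idea only: secret.txt is a made-up name, and on journaling or copy-on-write file systems and on solid state drives, over-writing a file in place is not guaranteed to hit the same physical blocks the old data lived on.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'my secret plans\n' > secret.txt

# Find out how many bytes the file holds (tr strips padding some versions of wc add).
size=$(wc -c < secret.txt | tr -d ' ')

# Over-write those bytes in place with random junk.
# conv=notrunc stops dd from emptying the file before it writes.
dd if=/dev/urandom of=secret.txt bs=1 count="$size" conv=notrunc 2>/dev/null

# Only now remove the name from the directory tree.
rm secret.txt
```

Systems with the GNU utilities installed also have a shred command that is built for this job; check its man page for the caveats.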

5.6. Printing Files

This section presumes you already have a working printer setup on your system.

To print your file, use lpr filename, where filename is the name of the file that you want to print. Remember kids, Unix does not care that your file is not human readable, and will try to print a great big binary executable just as if it were your grocery list for the week, so be careful about what file you tell it to print.

If you want to print multiple copies of a file, use lpr -#n filename, where n is the number of copies you want to make, and filename is the name of the file to be printed.

Since, for now anyway, the majority of the files you will likely be printing will be plain text files, it may also be worth noting that there is a way to format the files as you send them to the printer to make the output look prettier. Use lpr -p filename to print filename with some basic formatting.

As always, check out the man page to see what other command line options may be available to you.

5.7. Linking files to One Another

This is simple in practice: you can create a file that is a twin of another file anywhere else in your directory tree, or you can create a file that points to another file in your tree in a way similar to that of a hyperlink pointing to a different web page. Of course to understand what is going on requires a little bit of background, so I am going to split this up into three sections. Feel free to skip the first if you don't give a lick about what your machine is actually doing (note, those of you who do not understand the way files work and skip the first section will never be 1337).

5.7.1. Dissecting a File

Okay, so we all know what a file is in principle: it's a collection of data that is classified together under one name. Kind of like stuffing a bunch of loose papers into a folder and slapping a label on that folder. Here we need to make a distinction though: the contents of that folder are the file, whereas the folder label itself is the file name. I've been using the two interchangeably up until now, but we need to start thinking of them as two separate entities.

The reality of memory and storage is that things are rarely neat and clean on the drives. As old files are removed and new ones created the drive becomes fragmented (Windows users, you should all know about defragging). Thus when you are working with a file in memory, the memory blocks that the file occupies are most assuredly not contiguous, and I would be willing to bet that when that file is saved to your hard drive, the same is true. However, when we begin reading a file, we need to have a reference to that file so we can find the first chunk of it in the system. That's where the file name comes in.

Repeat after me: "The file name is just a reference." Got that? When you remove a file from the system, you are not actually deleting the contents of that file; that's why the FBI and your arch rival corporation can still get to the data. Instead, you are merely removing any reference to that chunk of data from your file system. Since there is no longer any direct way of retrieving the contents of the file, the operating system is free to over-write it. That's why the term "free space" is not nearly as accurate as "available space" when referring to how many GB are left for your mp3 collection. Those sections of the hard drive most likely do have something on them; it's just space that is available to be over-written with more pertinent data.

Another good name for a file name is a "link." Back to the example of the folder: say I want to make a copy of the folder. I photocopy everything, stick it in a new folder, and write a new label on that folder, and put it in my filing cabinet. This takes up more resources: more paper, a new folder, a new label, just like making a copy of a file on your directory tree takes up more storage space on your disk, as well as a new reference to that data (the file name). This is where digital beats paper every time.

In the digital world, physical space is an unnecessary abstraction. Who says I can't have more than one reference to the same group of data? The incorrect, that's who. Instead of incurring the overhead, or taking up the space, of making an actual copy of the file, I can simply create a new reference to it; effectively giving the same file two completely different names.

This technique is extremely useful for when you want the same file in two different directories. For example, if you're working on a set of XML documents that all will use the same DTD, but you are storing these XML documents in different directories for organizational purposes, it only makes sense to make a new link to the DTD instead of making a copy of it. That way, when you make a change to the DTD, that change will be visible to all of the directories without needing to edit anything else. Remember, multiple references to the same file.

Now that you understand the basics of file mechanics and what a link is, I will introduce the practice of creating new links. The two types of links available are hard links and symlinks (a.k.a. soft-links). I will explain the differences between these two when I discuss their creation.

5.7.2. Making a Hard Link

I have already talked about hard linking above, but to recap, when you create a hard link to a file, you are not making a copy of that file, you are creating a new reference to a file that already exists on your system. If you change the contents of the file, this will be reflected in both references (because there is only one file). One file, many references. The other side of this coin is that as long as a reference to a file exists, the space it occupies will not be reclaimed by the file system, so creating a hard link is a way of preventing accidental deletion (provided, of course, that you don't remove the hard link too).

The ln command is what we use to create a hard link to a file. The syntax is ln source_file dest_file, where source_file is the original reference to the file (i.e. the original file name), and dest_file is the new reference to the file. If you want to create the hard link in a directory different from the current one, you must specify the path to that directory as well as the file name as dest_file.
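
Here is a sketch that shows the "one file, many references" idea at work. The names original.txt and twin.txt are made up for the example, and it all takes place in a scratch directory.

```shell
# Work in a scratch directory so nothing in your home directory is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "version 1" > original.txt

# Create a second name for the same file.
ln original.txt twin.txt

# ls -i prints each name's inode number first; both names share one.
ls -i original.txt twin.txt

# A change made through one name is visible through the other.
echo "version 2" >> twin.txt
cat original.txt

# Removing one name leaves the data reachable through the one that remains.
rm original.txt
cat twin.txt
```

Both cat commands print the same two lines, because there was only ever one file underneath the two names.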

5.7.3. Making a Symlink (Softlink)

I haven't talked very much about symlinks yet, and there is an important difference between symlinks and hard links. Whereas a hard link is a new reference (filename) for a file that already exists, a symlink is a new reference (filename) for a currently existing reference (filename). Remember that a hard link is a reference to data, and that a soft link is a reference to a reference, and you should do fine.

So why would we use a soft link instead of a hard link? For starters, you may not have a choice in the matter. You cannot always create a hard link to a file: one of the prime examples being that many systems will not let you create a hard link to a file that you do not own, and a hard link can never point across to a different file system. Pretending you were working on a program to extract data from your system's /etc/passwd file, and you didn't want to type out the full path each time, you could make a symlink to /etc/passwd called "passwd" in your home directory.

Let me continue to use this example to illustrate where symlinks are different than hard links. Since a symlink is a reference to a reference, if anything should happen to the original reference, our symlink will no longer work even though the file may be quite intact. Suppose the (somewhat deranged) system administrator decides to change /etc/passwd to /etc/buddies; since our symlink was a reference to /etc/passwd and /etc/passwd no longer exists, our soft link is no longer valid even though the data is fine.

Another reason to use a symlink is when you want to limit direct access to the file. Some day, when you're installing software, you will realize that it is often easier to create a symlink to the executable, and place that link in a directory in your path (/usr/bin is usually a good choice), rather than to clutter up the executable directory with a whole bunch of libraries and config files that exist just fine on their own in the install directory. When you open or execute a file through a symlink, the system follows the link to the real file in the directory where it is actually located, so in the case of our installed program, the program that runs is still the one sitting next to its config files.

To create a soft link to another filename, use ln -s target_file link_name where target_file is the filename that we are linking to, and link_name is the name that we want to give to the symlink.
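
The renaming scenario above can be tried out safely with a sketch like the following. It assumes a scratch directory and uses invented file names (passwd_copy.txt, shortcut, buddies.txt) rather than the real /etc/passwd.

```shell
# Scratch directory and hypothetical file names, for illustration only.
workdir=$(mktemp -d)
cd "$workdir"

echo "alice bob" > passwd_copy.txt
ln -s passwd_copy.txt shortcut          # ln -s target_file link_name
cat shortcut                            # reads the target through the link

mv passwd_copy.txt buddies.txt          # the "deranged admin" renames the target
if [ -e shortcut ]; then
    echo "link still works"
else
    echo "link is broken: the name it points to is gone"
fi
cat buddies.txt                         # the data itself is perfectly fine
```

The test [ -e shortcut ] follows the link, so it fails once the name the link refers to no longer exists, even though the link itself is still sitting in the directory.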

Well, that about does it for our introduction to some cute and clever things you can do with files. There is more, of course, especially when you start wanting to make word-processed documents and write programs, but those are all topics for another day. Now I think it's time we moved on to another subject that is so near and dear to the heart of everyone who has ever lost something: how to find things in Unix.

6. Seek and Ye Shall Find

As the size of your directory tree grows, and you have more and more files to pick through, finding the right file, or the right part of a file, can become a time-consuming task. Fortunately, Unix gives us the means to search through the file system to find individual files, and to search through individual files to find specific patterns of text.

6.1. Searching the Directory Tree

Whether you're searching for a file you wrote buried somewhere in your own directory tree, or trying to find a little lost config file, log file, or executable somewhere in the larger directory tree, the appropriately named find command is just the tool you need to get the job done. The syntax for find is find path -name pattern -print where path is the path to the directory where we will begin our search, and pattern is the pattern that we are searching for. The final directive (-print) is one of a couple different possibilities, but for now a simple printing of all the search results to the screen should suffice.

By way of example, suppose you are currently in your home directory, and want to search through all of your files for a file named "diary.txt". You would use the command find . -name diary.txt -print to search through all of your subdirectories to find this file. Anything that is in a subdirectory of yours and matches "diary.txt" will be printed to the screen as it is found.

Pretending that you want to search through the entire directory tree for a file called "test.txt", you would use find / -name test.txt -print, and just as in the above case, find will start with the system root directory and parse through all subdirectories on the system, and will display the location of any file named test.txt.
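
If you would like to see find at work without waiting on a full system search, here is a small sketch. It builds a pretend directory tree in a scratch location; the directory and file names are invented for the example.

```shell
# Build a tiny pretend home directory (hypothetical layout).
workdir=$(mktemp -d)
mkdir -p "$workdir/journal/2003" "$workdir/school"
touch "$workdir/journal/2003/diary.txt" "$workdir/school/essay.txt"

cd "$workdir"
find . -name diary.txt -print    # find path -name pattern -print
```

The result is printed relative to the starting path, so here it shows up as a path beginning with "./".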

An important thing to remember about using find is that the larger the scope of your search, the longer it is going to take to find what you're looking for. Searching through your home directory's tree is going to take considerably less time than searching through your system directory tree. This obviously becomes less of an issue on a higher end system, but it is something worth considering. If you are looking for something that must be somewhere in your home directory tree, there's no reason to perform a full system search for it.

There are these handy things called regular expressions that make using find more efficient, and will let you be a little more sloppy in specifying what patterns you are looking for, but more on those a little bit later. Now we're going to jump into looking through files for patterns of text....

6.2. Searching your Files for Text

The editors and text file readers we've looked at so far have had the ability to search the file for a specific pattern, but there is a utility that allows us to search files for patterns of text without actually opening the file in a reader or editor. Given a pattern of text and a filename, grep will search that file for the pattern and print any matching lines to the screen. There are a few options that can be passed to grep that make this a bit easier, and as with the find command, regular expressions (more on those in a bit) allow you to search multiple files for complex patterns. For now, though, we're going to look at the bare bones grep syntax and build from there.

The syntax for a simple grep search of a text file is grep pattern file where pattern is the pattern of text that we're looking for, and file is the name of the file that we are going to search. Grep will print every line in which it finds the pattern to the screen before it finishes running. If you know exactly what the line looks like, or you just want this one line, then this should be sufficient, but most of the time you're looking for a little bit more.

If you use grep -n pattern file, grep will go that extra mile and will prefix each "hit" in the file with the line number that the pattern was found on. This is especially useful if you've found out that you have misspelled a certain word consistently throughout a large file. Of course it also comes in handy when you're trying to find an exact quotation, but you only know part of the quote.

Using grep -w pattern file will match only those instances of pattern in file where pattern matches a whole word. For example, whereas with a normal grep search, "the" will match with "there" and "them", if you use the -w option, then "the" will only match with "the".

Another nifty use of grep is grep -NUM pattern file where NUM is the number of surrounding lines to display along with each line in file that matches pattern (this is an actual number, not to be confused with the letter n in the -n option above). This will better allow you to evaluate the surrounding content to figure out if this is the match that you're looking for. An example of this usage: grep -4 Ophelia hamlet.txt will look through the hamlet.txt file, and for each time it matches "Ophelia", it will print the four lines that precede the match, the line that matches, and the four lines that follow the match.

These command line options will generally work together, so you can mix and match as you see fit. See the man page for a few more of them.
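
To get a feel for how these options behave and combine, here is a short sketch run against a made-up five-line file (sample.txt is my invention). It uses -C 1 for the surrounding-lines example, which is the spelled-out form of the -NUM shorthand in the versions of grep I know of.

```shell
# A tiny sample file in a scratch directory (hypothetical contents).
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' "over there" "the cat" "ask them" "line four" "the end" > sample.txt

grep the sample.txt          # plain search: also hits "there" and "them"
grep -n the sample.txt       # same hits, prefixed with line numbers
grep -w the sample.txt       # whole word only: "the cat" and "the end"
grep -C 1 cat sample.txt     # the match plus one line on either side
grep -n -w the sample.txt    # options mixed and matched
```

The last command prints only lines 2 and 5, each with its line number in front.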

6.3. Whirlwind Introduction to Regular Expressions

By now you've heard me throw around the term "regular expression" once or twice, and this is a good time to explain what a regular expression is, why they're so very helpful in searching, and how you can use them as patterns in conjunction with grep and find.

First a bit of a definition: a regular expression is a pattern that is built from a combination of characters and special meta-characters, which can be used for purposes of matching. I should probably warn you that, in the strictest computer science terms, this is not a correct definition; rather it is a functional one that will help you understand the process of creating a pattern for searching. If you want a more accurate and in depth definition of a regular expression, check out A Tao of Regular Expressions.

If you're thinking of a regular expression as a pattern constructed of building blocks, then it is important to know what those building blocks are. On the one hand, we have characters, which are most easily defined as any printing character that stands for itself. For example, "this" is a regular expression composed completely of characters. Each character in the regular expression "this" stands for itself (assuming we are trying to match against the word "this").

On the other hand, we also have meta-characters, which are characters that stand for something else. This is where we start to experience the true power of regular expressions in our searches, as the ability to use these meta-characters will allow us to search through multiple files for some obscure data with a single command. While there are quite a few different meta-characters available in the regular expression vocabulary, I'm only going to discuss two of them: dot (.) and star (*). The reason for this is that these are the two most common meta-characters that you will be using (for now).

In terms of regular expressions, the dot is used to symbolize any single character. The regex (short for regular expression) pattern "t.e" would match "the", "tie", "tqe", "tke", etc. However, it would not match "three", for example, because there is more than a single character there. Truthfully, you will most likely not get a lot of use out of this particular meta-character until you start writing scripts that use regular expressions, but in many ways this is the foundation of regex meta-char thought; how does one thing substitute for something else? Once you start to see the possible matches for "p.t" in your mind, then regex mastery is well within your grasp.

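Arguably, the most useful meta-char available is the star (*). The star is used to symbolize anything or nothing, but exactly what it stands for depends on where you use it. In the file name patterns (wildcards) understood by the shell and by find -name, the star on its own represents zero or more characters of any kind. For example, "th*" will match any name that begins with the letters "th", just as "*th" will match any name that ends with the letters "th". In a true regular expression, such as the patterns you hand to grep, the star (which is formally known as the Kleene Star) means zero or more occurrences of whatever comes immediately before it, so the regex way of saying "anything or nothing" is a dot followed by a star: "th.*" matches "th" followed by anything at all. The usefulness of this should already be more than apparent.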

The find command provides us a wonderful opportunity to show the usefulness of the star operator in action. Suppose we want to find the location of every text file that we have written in our home directory tree: find . -name "*.txt" -print. The quotes are there so that the shell hands the star to find untouched instead of expanding it first. This does of course presume that we are already in our home directory, but it will display the location of every single file whose name ends with ".txt" anywhere in the home directory or its subdirectories. This is eminently useful when you are looking for groups of files in your system, and in this context, the star operator will work with many other commands as well. Try ls *.txt to list every file that ends with ".txt" in the current directory. If you want to delete all of the files in a directory.... on second thought, if you REALLY want to shoot yourself in the foot, you're going to have to load the gun yourself :)
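
Here is a small sketch of the star at work with both find and ls, using a scratch directory and invented file names. The pattern given to find is quoted here so that the shell passes it along as-is and find does the matching itself.

```shell
# Pretend directory tree with a mix of file types (hypothetical names).
workdir=$(mktemp -d)
mkdir -p "$workdir/stories"
touch "$workdir/todo.txt" "$workdir/stories/raven.txt" "$workdir/stories/cover.png"
cd "$workdir"

find . -name "*.txt" -print    # every .txt file in this directory and below
ls *.txt                       # only the .txt files in the current directory
```

Notice that ls only sees todo.txt, because the shell expands the star against the current directory, while find walks down into stories as well.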

A similar application is useful when grepping through a file (or multiple files). Suppose you have a directory filled with text files that are all short stories by Edgar Allan Poe, and you want to go through them all and pick out each and every reference to ravens (sometimes Lit majors do things like that, and for fun no less. Pity me). grep -n .aven *.txt will do the job just fine, and because we use the dot operator, it will match both uppercase and lowercase "r", so it won't matter whether "raven" is capitalized or not. Bonus points for you if you can tell me what other common words this will match against!
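
A quick sketch of the dot in a grep pattern, assuming a made-up three-line file (poem.txt) in a scratch directory. The second search is my own addition to show a dot followed by a star standing for "anything in between".

```shell
# Hypothetical sample text in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' "Quoth the Raven" "a raven sat" "nothing more" > poem.txt

grep -n ".aven" poem.txt      # the dot matches either "R" or "r"
grep -n "a.*sat" poem.txt     # "a", then anything (or nothing), then "sat"
```

Quoting the pattern is a habit worth picking up, since it keeps the shell from trying to interpret the meta-characters before grep gets them.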

That about does it for our look at finding wayward souls that have gone astray in our file system (somewhere). Practice these techniques, they will come in handy, and may save you hours of work and frustration.

7. Network and Internet Utilities

It may be a bit of a lofty assumption, but I am going to pretend that, for this section, you already have your internet connection set up and active. The reason for this is that there are a number of good HOWTOs out there that will match just about any connection need that you have, and I don't see any need to re-invent the wheel. If you're trying to get your network connection up and running, see the References section below for some places where you can find help.

For those of you who do have your network connection off the ground, it's time for you to meet a handy dandy group of command line utilities that will let you do just about anything you might want to do on the Internet from the comfort of your terminal window.

7.1. Checking a Host with ping

The ping command is useful primarily for testing your network connection, but it can also be handy to see if a server you're trying to access is up, or if it might just be a slow connection, etc. The basic syntax is ping hostname where hostname is the name of the remote computer (e.g. www.axisoftime.com) that you are trying to reach. If the host is up (and responding to ping requests; be aware that some net admins disable responding to ping requests) you will get a reply that gives you statistical information as to how long it took between sending a ping and receiving a reply. Otherwise you will get something like "request timed out" or "host not reachable." Either way, this is a handy utility that we just couldn't overlook.

7.2. Transferring Files using FTP

Let me start out by saying that the FTP client that comes installed with most Unix/Linux systems by default is nowhere near as cool as the ncftp client. In fact, before you try running "ftp", you may want to try running "ncftp" to see if you have this much more slick alternative. Still, I am going to include a discussion on the standard Unix FTP client because it is, well, standard. On a minimal system or a rescue disk, if you are lucky enough to have an FTP client available, it will probably be this one. In all other cases, get yourself a copy of ncftp and use that; it is far superior.

7.2.1. The Standard FTP Client

While not as robust as ncftp, the standard ftp client will be available to you in times when ncftp won't be, and that makes it worth knowing how to use. After all, when push comes to shove it can still get and put files to the same places, it just lacks some of the very useful bells and whistles that ncftp offers.

To begin your ftp session, open the ftp prompt by using ftp hostname where hostname is the name of the FTP server you are trying to connect to. If you do not specify this value now, you will have to use open hostname after you have entered the ftp prompt in order to connect to the FTP server. In either case, once the ftp client has successfully connected to the server, it will prompt you for your username and password. Enter these, and you will be logged on to the system and ready to send/receive files. If the server supports anonymous or guest logins, use "anonymous" or "guest" for the login name.

You should be logged into the FTP server now. At this point, you may want to use binary, which will force all transfers to be made in binary format (so you can grab things besides text files). Now you have a few options. Use get filename to get a file named filename, and use put filename to put a file named filename onto the remote computer. You can similarly transfer multiple files using mget filenames and mput filenames to get and put multiple files respectively. In both cases, filenames is a list of files for placement or retrieval, and this does support the use of the star wildcard in the same way that many local commands do. For example, if you use mget *.rpm, you will begin to download every rpm file in the directory.... which may take some considerable time.

You should be able to use ls and cd as normal to navigate your way around on the remote system, but there is a special command that you will need to use to navigate on your own machine while running ftp: lcd directory, where directory is the name of a directory in your computer's directory tree, will change the local working directory to the one you specified. For example, if you started your ftp session in your home directory, but want to download a file into the subdirectory "files", you would use lcd files. This command has no bearing on the directory that you are working in on the remote machine.

If you find that you need to create a new directory in your computer's directory tree during your ftp session (a not that uncommon occurrence), use lmkdir directory where directory is the path to the directory you wish to create. Be aware that not every ftp client recognizes lmkdir; if yours doesn't, the standard client's shell escape should do the same job: !mkdir directory. This command is a close relative of the mkdir command, so you may want to take another look at mkdir for further advice.

Finally, to end your ftp session, type quit.

7.2.2. Using ncftp

The essential commands available using ftp are basically identical to the ones available in ncftp, but the addition of several features, like download progress indicators, make ncftp superior to the standard Unix ftp client. Thus, you should see the section on the standard ftp client for a list of the commands that will be available during your ftp session, and note the additions or changes that you will find in ncftp here.

When you begin your ncftp session, if you do not specify a username on the command line, ncftp will try to log you in as anonymous. Thus, when you start your ncftp session, the proper syntax is ncftp -u username hostname where username is the username that you use to log onto the system, and hostname is the name of the remote computer you are connecting to. Aside from the improved interface and features that ncftp offers over ftp, this should be the only significant difference you encounter. So as I said above, see the section on the standard ftp client for some of the more common commands available in an ncftp session.

7.3. Connecting to a Remote Computer with Telnet

Telnet is severely insecure, do not use it!

The syntax for telnet is telnet hostname where hostname is the name of the remote computer. You will be prompted for your username and password, but after that using telnet should be just like being at the command line, because you are at the command line of a remote computer. Some versions of programs and a few syntactic quirks do vary among Unix cousins, so there may be a few inconsistencies, but you should be able to recognize everything without much trouble.

It does, however, bear repeating that telnet is insecure. When you log in through telnet, your password is sent unencrypted over an open connection, and anyone who is listening will be able to get your log in information with very little effort. For this reason, many sys admins have disabled telnet on their systems, but even if it is available, you should not use it.

The secure alternative to telnet is a program called Secure Shell (ssh), which comes pre-installed on many modern Unix related systems. It effectively allows you to do the same things that telnet will allow you to do, but encrypts transmissions, so the threat of someone snooping on your connection is dramatically lessened.

7.4. Using the Secure Shell (ssh)

The Secure Shell is a secure alternative to telnet. When you use the client program, it encrypts data, which is then decrypted when received by the ssh server. For more details, and for information on how to download and install ssh on your system, check out OpenSSH, but remember that most distributions of Unix and Linux will come with it pre-installed these days, so you might want to try running it before you start downloading anything.

To connect to a remote host with ssh, use ssh -l username hostname where username is your username on the remote system, and hostname is the name of the remote system that you are connecting to. The first time you connect to the remote computer, or whenever the remote computer changes its encryption key, you will be asked if you trust the remote site to be who they claim to be, and will be asked if you want to save their key. Generally it is okay to do this, but if you really believe that someone may have hijacked the domain of the system and is just trying to trick you into giving out your username and password (both require a considerable amount of effort, and so are fairly unlikely occurrences), say no and disconnect.

Once you are connected through ssh, just like with telnet, you are connected to the remote computer's command line, and so everything should work as normal.

Secure Shell does offer a number of advanced features that you may want to look into further, and aside from the man page, the OpenSSH website has a wealth of information on the subject, so look them up when you want to make your very own encryption key pair (oooooh, so cypherpunkish).

7.5. Web Browsing on the Command Line (?!?)

Yes, it is possible to browse the web from the command line. This is doable thanks to lynx, which is a lightweight text only browser that you can use for reading both online and offline web content from the comfort of your command line. This is of special use to anyone who may need to browse through an HTML HOWTO when X is not available to them, but the ability to browse the web from a command prompt can also be used for impressing your friends, members of the opposite sex, and anyone who thinks they're 1337 but really is a 7001. Okay, so maybe not, but it still comes in hella handy when you don't have X to fall back on.

The syntax to start a lynx session is lynx hostname where hostname is the name of the web page that you are trying to get to. You can follow links forward by pressing the right arrow key when you have selected a link, and can go back using the left arrow key. Pressing the space bar will move one page down on the current page. To quit lynx, type Q.

Just remember that lynx is a text only browser, so you won't have any pictures, and you will have to visit frames like following any other link.

That should be an effective introduction to the various ways of connecting to remote computers from the command line. As always, check the man page of a program for more options. Just remember that if your connection to the network is not working, then none of these network connection utilities will work either; I know it sounds obvious, but I also happen to know that a good number of people in the world are oblivious to the obvious.

8. Working with File Compression

If you've been using Windows, then you're already familiar with file compression, most likely the zip format. You can take a file and run a zip program on it to shrink the file size somewhat (making it easier to store/download), and then when you need to open the file again, you unzip it and extract the contents.

If you're going to be running Linux and downloading stuff, getting familiar with the Unix preferred compression method is essential, and it can be a somewhat intimidating task because it is a two-part process. The end result of this process is what is known as a "tarball" and will have either a *.tar.gz extension, or a *.tgz extension by convention, so if you see one of those types, you'll know what you're dealing with.

8.1. Grouping Files with tar

History lesson time, kiddies. Once upon a time, data was stored on magnetic tape drives, which, unlike modern disk drives, offered only sequential access. For a quick comparison, think of these data tapes as audio cassettes. In fact, these are still used today, usually because they offer much greater capacity per cassette than random access devices do.

As I've said before, Unix is old. One of the consequences of this is that many of the utilities that we use hearken back to a bygone era; tar is a prime example of this. For the curious, "tar" is short for "tape archive," which describes the function of the utility. Basically you use it to put a bunch of files together into one big file, but you can also group together directories in this manner (actually, that is the preferred method). Remember, with tar, no actual compression/decompression is taking place; all you are doing is assembling your files for compression.

8.1.1. Creating a tar Archive

The syntax for creating a tar file is tar -cvf archive include_files where archive is the name of the tar file that we are creating, and include_files is either a file, a list of files, a directory, or a list of directories. Like I said before, it is good etiquette to put similarly grouped files together in the same directory. Otherwise this can result in a real mess when you un-tar the thing. For example, to create a tar file of all of your text files, we would first put every text file in a directory (it can have any name we want) called "txt_files". Then we would use tar -cvf textfiles.tar txt_files. The result is a nice neat tar file ready for packaging.
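
The txt_files example can be tried out with a sketch like this one. It assumes a scratch directory, and the two text files inside txt_files are stand-ins I made up.

```shell
# Scratch directory with a couple of pretend text files.
workdir=$(mktemp -d)
cd "$workdir"
mkdir txt_files
echo "dear diary" > txt_files/diary.txt
echo "buy milk" > txt_files/todo.txt

tar -cvf textfiles.tar txt_files    # c = create, v = verbose, f = archive name follows
```

The v option makes tar list each item as it is added, which is a cheap way to confirm that everything you expected went in.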

8.1.2. Appending to a tar Archive

If we want to add something to an already existing tar file, the syntax would be tar -rvf archive include_files where, as above, archive is the name of the tar file, and include_files is the name of a file, files, directory, or directories that we want to add to the archive. For example, say we want to add a second grouping of text files that were written after the creation of our archive; putting them in a directory called (again, completely arbitrary name on my part) "txt_files2", we would use tar -rvf textfiles.tar txt_files2.

8.1.3. Viewing the Contents of a tar Archive

Another exceedingly common thing to do is to examine the contents of the tar file that you made before compressing it. That saves you the embarrassment of having to announce to the entire web (or maybe even worse, your comp. sci. professor) that the reason nothing works right is that you didn't double check your work. To examine the contents of a tar file, use tar -tvf archive where archive is the name of the tar file that we're examining. The contents of the tar file will then be printed to the screen for your examination.
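
Here is a sketch that appends a second directory to an archive and then checks the contents, assuming a scratch directory and made-up files. The archive is built quietly first so that the append and the listing are the only things printed.

```shell
# Scratch directory with two pretend groups of text files.
workdir=$(mktemp -d)
cd "$workdir"
mkdir txt_files txt_files2
echo "old notes" > txt_files/a.txt
echo "new notes" > txt_files2/b.txt

tar -cf textfiles.tar txt_files      # the archive as it existed before
tar -rvf textfiles.tar txt_files2    # r = append to an existing archive
tar -tvf textfiles.tar               # t = list the table of contents
```

The listing should show both directories, so you can double check your work before anyone else has to.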

8.1.4. Unpacking a tar Archive

While all this info on putting together a tar archive may be of interest to you one day, you'll probably be unpacking other people's tar files long before you start packing up your own. To extract the contents of a tar file into your directory tree, use tar -xvf archive where archive is the name of the tar file that you're extracting. The contents of the tar file will then be dumped into the directory that you are currently working in. This is why it's considered good form to put everything in its own directory. If you don't, you can really make a mess of other people's directories, and then they probably won't like you very much, so you can kiss your social life goodbye.
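
To see where the extracted files land, here is a sketch that builds a small archive and then unpacks it from inside a different directory. Everything lives in a scratch location, and the names (unpack_here and friends) are invented for the example.

```shell
# Build a small archive to practice on (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir"
mkdir txt_files
echo "dear diary" > txt_files/diary.txt
tar -cf textfiles.tar txt_files

mkdir unpack_here
cd unpack_here
tar -xvf ../textfiles.tar       # x = extract into the current directory
cat txt_files/diary.txt         # the file comes back inside its own directory
```

Because the archive was made from a directory, the extraction recreates txt_files under unpack_here rather than scattering loose files around.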

Bear in mind that you need to decompress that tarball before you can un-tar it.

8.2. The Compression Factor

Okay, so now you have yourself a nice, neat little package, and now we can start to apply some compression. Or, taken another way, you've already downloaded a tarball, and need to decompress it before you can un-tar it. Since I'm pretending like you're practicing these techniques on your own right now, we'll look at compressing files before we talk about decompressing them.

8.2.1. Compressing a File with gzip

gzip is the compression utility du jour on Unix and Linux systems. It is published by the GNU project, and as such is Free Software, and unless you are using a minimal distribution, it is most assuredly available on your system. To compress a file using gzip, use gzip filename where filename is the name of the file that we are compressing. The result will be a compressed version of filename with the extension ".gz" appended to the name of the file. Yep, it is that simple.
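
A minimal sketch, assuming gzip is installed and using a made-up file (poem.txt) in a scratch directory:

```shell
# A small, repetitive text file to compress (hypothetical contents).
workdir=$(mktemp -d)
cd "$workdir"
for i in 1 2 3 4 5; do echo "quoth the raven, nevermore"; done > poem.txt

gzip poem.txt    # gzip filename
ls               # poem.txt has been replaced by poem.txt.gz
```

Note that gzip does not leave the original lying around; the compressed file takes its place.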

8.2.2. Decompressing a File with gunzip

Riddle: how do you un-gzip a file? With gunzip of course. The basic syntax is every bit as simple as its counterpart: gunzip filename where filename is the name of a compressed file that you want to decompress.
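
Putting the two parts of the process together, here is a sketch of opening up a tarball from start to finish. It makes its own tarball first so that there is something to practice on; the directory and file names are invented, and everything happens in a scratch directory.

```shell
# Make a practice tarball (hypothetical names).
workdir=$(mktemp -d)
cd "$workdir"
mkdir txt_files
echo "dear diary" > txt_files/diary.txt
tar -cf textfiles.tar txt_files
gzip textfiles.tar              # leaves textfiles.tar.gz
rm -r txt_files                 # pretend all we have is the downloaded tarball

gunzip textfiles.tar.gz         # part one: decompress, back to textfiles.tar
tar -xvf textfiles.tar          # part two: un-tar
cat txt_files/diary.txt
```

The order matters: decompress first, then un-tar, which is just the packaging steps run in reverse.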

9. Wrap-up

As the sun is starting to set, I think it's time I was riding off into it, pardners. I sure hope you found my tale witty and informative (one if not both anyway). Remember that I've tried to keep things nice and simple for you here, and that, in most cases, each of the commands I've talked about has a heapin' helpin' of other options and parameters that can be passed to it to modify its behavior. Whenever you're curious about learning a different, perhaps quicker, way of doing something, take a look at the man page and peruse the list of options to see what you might be able to do.

And remember, keep backups. You're going to screw up from time to time, and keeping fairly consistent backups makes the experience a lot less painful. Until next time we meet.....

Happy hacking!

--starX

10. License

GNU Free Documentation License Version 1.2, November 2002

Copyright (C) 2000,2001,2002 Free Software Foundation, Inc. 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA Everyone is permitted to copy and distribute verbatim copies of