CEG 233: Linux and Windows 

Lab on Security and Privacy

This lab and this article are intended as a quick introduction to the issues of security and privacy brought on by computers and the Internet.

Table of Contents

  1. Educational Objectives
  2. Security and Privacy
    1. Cryptography in Security and Privacy
    2. Viruses, Worms and Trojans
    3. File Integrity Checkers
    4. Firewalls
    5. Secure Shell, ssh
    6. Passwords
    7. Harmful WebSites
    8. Email
    9. Private Files and Directories
  3. Lab Experiment
  4. Acknowledgements
  5. References

Educational Objectives

The objectives of this article are to make you:

  1. Aware of the issues brought about by computers on security and privacy of individuals and organizations.
  2. Understand what viruses are and how they are detected.
  3. Able to distinguish between viruses, worms and trojans.
  4. Understand the technical essentials of firewalls.

The objectives of this lab experiment are to make you:

  1. Learn techniques to improve security and privacy in your computer work.
  2. Use a few security/privacy tools.
  3. Aware of the security and privacy issues.

Security and Privacy

Definitions of Computer Security

"Paranoia is our profession." -- Strategic Air command

"Security Incident: The attempted or successful unauthorized access, use, disclosure, modification, or destruction of information or interference with system operations in an information system." [From Department of Homeland Security]

"The protection of data, networks and computing power. The protection of data (information security) is the most important. The protection of networks is important to prevent loss of server resources as well as to protect the network from being used for illegal purposes. The protection of computing power is relevant only to expensive machines such as large supercomputers."  [From ZDNet]

Definitions of Privacy

"Why should you care if you have nothing to hide?" -- J. Edgar Hoover, Year?
"You already have zero privacy -- get over it." -- Scott McNealy, 1999, Sun Microsystems CEO

"Privacy: An individual's or organization's right to determine whether, when and to whom personal or organizational information is released. Also, the right of individuals to control or influence information that is related to them, in terms of who may collect or store it and to whom that information may be disclosed."  [From Department of Homeland Security]

"Privacy Rights: The specific actions that an individual can take or request to be taken with regard to the uses and disclosures of their information." [From Department of Homeland Security]

 "The degree to which an individual can determine which personal information is to be shared with whom and for what purpose. Although always a concern when users pass confidential information to vendors by phone, mail or fax, the Internet has brought this issue to the forefront. Web sites often have privacy policies that stipulate exactly what will be done with the information you enter. For more information, visit www.privacyalliance.org  and www.epic.org . Contrast with confidentiality, which deals with unauthorized access to data."  [From ZDNet]

"Civilization is the progress toward a society of privacy. The savage's whole existence is public, ruled by the laws of his tribe. Civilization is the process of setting man free from men."  -- A character in the book The Fountainhead (1943) by Ayn Rand, http://www.aynrand.org/

Ten Immutable Laws of Security

[From http://www.microsoft.com/technet/archive/community/columns/security/essays/10imlaws.mspx?mfr=true]

  1. If a bad guy can persuade you to run his program on your computer, it's not your computer anymore.
  2. If a bad guy can alter the operating system on your computer, it's not your computer anymore.
  3. If a bad guy has unrestricted physical access to your computer, it's not your computer anymore.
  4. If you allow a bad guy to upload programs to your website, it's not your website any more.
  5. Weak passwords trump strong security.
  6. A computer is only as secure as the administrator is trustworthy.
  7. Encrypted data is only as secure as the decryption key.
  8. An out of date virus scanner is only marginally better than no virus scanner at all.
  9. Absolute anonymity isn't practical, in real life or on the Web.
  10. Technology is not a panacea.

Cryptography in Security and Privacy

Cryptography is a central pillar in providing confidentiality (and hence security and privacy).  This section is an accessible introduction to the field.  The data needing to be protected is encrypted using carefully guarded keys.  The encrypted data is so extremely hard ("computationally infeasible") to decode without the keys that it is considered protected from prying eyes.

A cryptographic encryption algorithm, also known as a cipher, takes a "plain text" (e.g., human readable) pt and a key as input and produces cipher text ct as output, ct = cipher(pt, key), so that it is possible to regenerate pt from ct through a companion decryption algorithm.  Note that we said "for example, human readable" and not "that is, human readable" as an explanation for the phrase "plain text".  In other words, the so-called "plain text" may be binary data that is unreadable to humans but ready to use by a computer.

Ciphers use keys together with plain text as the input to produce cipher text.  It is in the key that the security of a modern cipher lies, not in the details of the algorithm, which are public knowledge.

What does "Computationally Infeasible" mean?

Roughly speaking, computationally infeasible means that a certain computation that we are talking about takes way too long (hundreds of years)  to compute using the fastest of (super) computers. 

Suppose our key is a 128-bit number.  There are

340,282,366,920,938,463,463,374,607,431,768,211,456

128-bit numbers starting from zero (i.e., 128 bits of 0).  To recover a particular key by brute force, one must, on average, search half the key space:

170,141,183,460,469,231,731,687,303,715,884,105,728.

If we use 1,000,000,000 machines that could try 1,000,000,000 keys/sec, it would take all these machines longer than the universe as we know it has existed to find the key.
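The arithmetic behind this claim can be checked directly. The sketch below is only a rough estimate: the length of a year (365.25 days) and the age of the universe (about 13.8 billion years) are assumptions that do not come from the text above.

```python
# Rough brute-force estimate for a 128-bit key.
KEY_BITS = 128
MACHINES = 1_000_000_000                # number of machines (from the text)
KEYS_PER_SEC = 1_000_000_000            # keys per second per machine (from the text)
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # assumption: 365.25-day year
AGE_OF_UNIVERSE_YEARS = 13.8e9          # assumption: approximate estimate

keyspace = 2 ** KEY_BITS
average_tries = keyspace // 2           # on average, half the key space is searched

seconds = average_tries / (MACHINES * KEYS_PER_SEC)
years = seconds / SECONDS_PER_YEAR

print(f"key space      : {keyspace:,}")
print(f"average tries  : {average_tries:,}")
print(f"years required : {years:.3e}")
print(f"universe ages  : {years / AGE_OF_UNIVERSE_YEARS:.0f}")
```

Under these assumptions the search takes on the order of 10^12 years, which is several hundred times the assumed age of the universe.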

Computational infeasibility is not the same idea as Turing-incomputability. Nor does it mean that you cannot make a lucky guess, or heuristically arrive at a possible answer, and then systematically verify that the guessed answer is indeed the correct answer, all within a matter of seconds on a lowly PC.  Here is an example:  Microsoft Windows NT uses the DES encryption algorithm in storing the passwords. Brute-forcing such a scrambled password to compute the plain text password can take, according to Microsoft, "about a billion years." But the L0pht team (www.l0pht.com) claims that L0phtCrack breaks Windows NT passwords in about one week, running in the background on a Pentium 200 based PC.

In the context of cryptography, the factorization of an arbitrarily large number N into its constituent primes, that is, determining the powers n_2, n_3, n_5, n_7, etc. of the primes in

N = 2^{n_2} * 3^{n_3} * 5^{n_5} * 7^{n_7} * ...

is computationally infeasible, as far as we know.

Based on this, decryption without the key is computationally infeasible for ciphers, such as RSA, whose security rests on the difficulty of factoring.

One way hash function

A one-way hash function takes a variable-length input sequence of bytes and converts it into a fixed-length sequence. The fixed length is considerably shorter than the typical length of the input, and hence the function is a hash function. The "one way" means that the function is designed to be computationally infeasible to reverse the process, that is, to discover, with no prior information other than the hash, a string that hashes to a given value. 

By their nature, all hash functions must map multiple input sequences to the same hash.  The inverse is therefore a mathematical relation, not a mathematical function. But good hash functions have the following properties: It is rare to find two strings, from the expected set of typically used strings, that produce the same hash value.  A slight change in an input string causes the hash value to change drastically. E.g., if flipping one bit in the input string causes about half of the bits in the hash value to flip, then the hash is a good one.
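The "slight change, drastic effect" property can be observed with a few lines of Python. This is a minimal sketch that uses SHA-256 from the standard hashlib module; the choice of SHA-256 and the sample message are assumptions made for illustration, not something this article prescribes.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bit positions in which two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

message = bytearray(b"Pay Bob 100 dollars")
flipped = bytearray(message)
flipped[0] ^= 0x01                      # flip the lowest bit of the first byte

h1 = hashlib.sha256(bytes(message)).digest()
h2 = hashlib.sha256(bytes(flipped)).digest()

total_bits = len(h1) * 8                # SHA-256 always yields 256 bits
changed = bit_difference(h1, h2)

print(h1.hex())
print(h2.hex())
print(f"{changed} of {total_bits} hash bits changed")
```

The two inputs differ in a single bit, yet the two hash values should differ in roughly half of their bits.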

One-way hash functions are also known as message digests (MD), fingerprints, or compression functions. The most popular one-way hash algorithms are MD4 and MD5 (both producing a 128-bit hash value), and SHA, also known as SHA1 (producing a 160-bit hash value).

Symmetric-key cryptography

Symmetric-key cryptography is an encryption system in which the sender and receiver of a message share a single, common key to encrypt and decrypt the message.  Symmetric-key systems are simpler and faster, but their main drawback is that the two parties must somehow exchange the key in a secure way. Symmetric-key cryptography is sometimes also called secret-key cryptography.

If ct = encryption (pt, key), then pt = decryption (ct, key).
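The relationship above can be illustrated with a toy cipher. The sketch below is not DES or any real algorithm: it XORs the data with a key stream derived from the shared key using SHA-256, a construction assumed here purely for demonstration and not suitable for protecting real data. The function names mirror the formula above.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def encryption(pt: bytes, key: bytes) -> bytes:
    """XOR the plain text with the key-derived stream."""
    return bytes(p ^ k for p, k in zip(pt, keystream(key, len(pt))))

def decryption(ct: bytes, key: bytes) -> bytes:
    """XOR is its own inverse, so decryption repeats the same operation."""
    return encryption(ct, key)

key = b"shared secret key"      # both parties must already hold this key
pt = b"Meet me at noon."
ct = encryption(pt, key)

print(ct.hex())
print(decryption(ct, key))
```

Both sides call the same routine with the same key, which is exactly why the key has to be exchanged securely beforehand.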

DES

The most popular symmetric-key system is DES, short for Data Encryption Standard.  DES was developed in 1975 and standardized by ANSI in 1981 as ANSI X3.92. DES encrypts data in 64-bit blocks using a 56-bit key.  The algorithm transforms the input in a series of steps into a 64-bit output.

IDEA

IDEA (International Data Encryption Algorithm) is a block cipher which uses a 128-bit length key to encrypt successive 64-bit blocks of plain text. The procedure is quite complicated using subkeys generated from the key to carry out a series of modular arithmetic and XOR operations on segments of the 64-bit plaintext block. The encryption scheme uses a total of fifty-two 16-bit subkeys.

Blowfish

Blowfish is a symmetric block cipher that can be used as a drop-in replacement for DES or IDEA. It takes a variable-length key, from 32 bits to 448 bits, making it ideal for both domestic and exportable use.  Blowfish is unpatented and license-free, and is available free for all uses.

Public-key Encryption

Public key cryptography uses two keys -- a public key known to everyone, and a private or secret key known only to the recipient of the message. Public key cryptography was invented in 1976 by Whitfield Diffie and Martin Hellman. For this reason, it is sometimes also called Diffie-Hellman encryption. It is also called asymmetric encryption because it uses two keys instead of one key. The two keys are mathematically related, yet it is computationally infeasible to deduce one from the other.

Two well known algorithms for generating the keys are RSA and DSA.

RSA

The most well-known of the public-key encryption algorithms is RSA, named after its designers Rivest, Shamir, and Adleman. The un-breakability of the algorithm rests on the fact that no efficient way is known to factor very large numbers into their primes.

  1. Find two primes, p and q.
  2. Compute the product, n = p*q (called, the public modulus).
  3. Choose e (the public exponent), such that (i) e < n, and (ii) e is relatively prime to (p-1)*(q-1).
  4. Compute d (the private exponent) such that (e*d) mod ((p-1)*(q-1)) = 1.

Then, the public key is (n, e), and the private key is (n, d). The public key is published in well known places; the private key must be safeguarded.  If the number n is small, p and q are easy to discover.  Thus, p and q are chosen to be as large as possible, say, more than a hundred digits long.  Obviously, p and q should never be revealed, and are preferably destroyed.

Encryption is done as follows.  Consider the entire message to be encrypted as a sequence of bits.  Suppose the length of n in bits is b.  Split the message into blocks of length b or b-1.  A block viewed as a b-bit number should be less than n; if it is not, choose it to be b-1 bits long.  Each block is separately encrypted, and the encryption of the entire message is the catenation of the encryption of the blocks.  Let m stand for a block viewed as a number.  Multiply m with itself e times, and take the modulo n result as c, which is the encryption of m.  That is,   c = m^e mod n.

Decryption is the "inverse" operation:  m = c^d mod n.
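The key generation, encryption, and decryption steps can be traced with very small numbers. The values p = 61, q = 53, e = 17, and m = 65 below are illustrative assumptions; real keys use primes hundreds of digits long, and real implementations add padding that this sketch omits.

```python
from math import gcd

# Step 1: two tiny primes, chosen only for illustration.
p, q = 61, 53
# Step 2: the public modulus.
n = p * q
phi = (p - 1) * (q - 1)
# Step 3: a public exponent relatively prime to (p-1)*(q-1).
e = 17
assert gcd(e, phi) == 1
# Step 4: the private exponent, the modular inverse of e (needs Python 3.8+).
d = pow(e, -1, phi)

m = 65                      # one message block, viewed as a number less than n
c = pow(m, e, n)            # encryption: c = m^e mod n
recovered = pow(c, d, n)    # decryption: m = c^d mod n

print("public key :", (n, e))
print("private key:", (n, d))
print("cipher text:", c)
print("recovered  :", recovered)
```

The three-argument form of pow performs modular exponentiation efficiently, so m is never literally multiplied by itself e times.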

The only way known to find d is to know p and q.  The e and n are the public key, which is published, while d is the private key, which must be kept secret. The e and d are symmetric in that using either as the encryption key, the other can be used as the decryption key.

Secure Communication Using Public Keys

Public-key systems, such as Pretty Good Privacy (PGP), are popular for transmitting information via the Internet. They are extremely secure and relatively simple to use.  You, of course, need to know the recipient's public key to encrypt a message. A global registry of public keys is needed, which is one of the services of the LDAP technology.

When John wants to send a secure message to Jane, he uses Jane's public key to encrypt the message. Jane then uses her private key to decrypt it.   Anyone with the public key can encrypt a message, but cannot decrypt it with that key. Only the person with the private key can decrypt the message.  Conversely, if a message is encrypted with someone's private key, it can only be decrypted with the corresponding public key.

In real-world implementations, public keys are rarely used to encrypt actual messages because public-key cryptography is slow, about 1000 times slower than symmetric-key cryptography.

Instead, public-key cryptography is used to distribute symmetric keys, which are then used to encrypt and decrypt actual messages, as follows:

  1. Bob sends Alice his public key.
  2. Alice generates a random symmetric key (usually called a session key), encrypts it with Bob's public key, and sends it to Bob.
  3. Bob decrypts the session key with his private key.
  4. Alice and Bob exchange messages using the session key.
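The four steps above can be sketched end to end. Everything in this example is a toy under stated assumptions: Bob's RSA key is built from two small Mersenne primes, the session key is only 64 bits, and the symmetric cipher is a simple XOR with a SHA-256-derived key stream. None of these choices is adequate for real use.

```python
import hashlib
import secrets
from math import gcd

# Bob's toy RSA key pair (two Mersenne primes; far too small for real use).
p, q = 2**31 - 1, 2**61 - 1
n, phi = p * q, (p - 1) * (q - 1)
e = 65537
assert gcd(e, phi) == 1
d = pow(e, -1, phi)                 # Bob keeps d private

def xor_stream(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256-derived key stream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ s for b, s in zip(data, stream))

# Step 1: Bob sends Alice his public key (n, e).
# Step 2: Alice generates a random session key and encrypts it for Bob.
session_key = secrets.token_bytes(8)                     # 64 bits, so it is < n
wrapped = pow(int.from_bytes(session_key, "big"), e, n)

# Step 3: Bob recovers the session key with his private key.
bob_key = pow(wrapped, d, n).to_bytes(8, "big")

# Step 4: both sides now use the fast symmetric cipher.
ct = xor_stream(b"Hello Bob, this is Alice.", session_key)
print(xor_stream(ct, bob_key))
```

Only the short session key passes through the slow public-key operation; the bulk of the traffic uses the symmetric cipher.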

Digital Signatures

A digital signature is a way to authenticate to a recipient that a received object is indeed that of the sender.

  1. Alice computes a one-way hash of a document.
  2. Alice encrypts the hash with her private key. The encrypted hash becomes the document's signature.
  3. Alice sends the document along with the signature to Bob.
  4. Bob computes a one-way hash of the document received from Alice, decrypts the signature with Alice's public key, and compares the two values. If they match, Bob knows that: (1) the document really came from Alice and (2) the document was not tampered with during transmission.
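The signing and verification steps can be sketched with the same kind of toy RSA arithmetic. The key below (built from two small Mersenne primes), the use of SHA-256, and the reduction of the hash modulo n are all simplifying assumptions; real signature schemes use much larger keys and a padding scheme.

```python
import hashlib

# Alice's toy RSA key pair (illustrative Mersenne primes; not secure).
p, q = 2**31 - 1, 2**61 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))

def digest_as_int(document: bytes) -> int:
    """One-way hash of the document, reduced so that it is smaller than n."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % n

def sign(document: bytes) -> int:
    """Steps 1-2: hash the document, then encrypt the hash with the private key."""
    return pow(digest_as_int(document), d, n)

def verify(document: bytes, signature: int) -> bool:
    """Step 4: compare a fresh hash with the signature decrypted by the public key."""
    return pow(signature, e, n) == digest_as_int(document)

doc = b"I owe Bob ten dollars. Signed, Alice"
sig = sign(doc)

print(verify(doc, sig))
print(verify(b"I owe Bob ten thousand dollars. Signed, Alice", sig))
```

The first check should succeed and the second should fail, since the altered document hashes to a different value than the one Alice signed.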

Man-in-the-Middle Attack

The public key-based communication between Alice and Bob described above is vulnerable to a man-in-the-middle attack.

Let us assume that Mallory, a cracker, not only can listen to the traffic between Alice and Bob, but also can modify, delete, and substitute Alice's and Bob's messages, as well as introduce new ones.  Mallory can impersonate Alice when talking to Bob and impersonate Bob when talking to Alice. Here is how the attack works.

  1. Bob sends Alice his public key. Mallory intercepts the key and sends her own public key to Alice.
  2. Alice generates a random session key, encrypts it with "Bob’s" public key (which is really Mallory's), and sends it to Bob.
  3. Mallory intercepts the message. She decrypts the session key with her private key, encrypts it with Bob's public key, and sends it to Bob.
  4. Bob receives the message thinking it came from Alice. He decrypts it with his private key and obtains the session key.
  5. Alice and Bob start exchanging messages using the session key. Mallory, who also has that key, can now decipher the entire conversation.

A man-in-the-middle attack works because Alice and Bob have no way to verify they are talking to each other. An independent third party that everyone trusts is needed to foil the attack. This third party could bundle the name "Bob" with Bob's public key and sign the package with its own private key. When Alice receives the signed public key from Bob, she can verify the third party's signature. This way she knows that the public key really belongs to Bob, and not Mallory.

Digital Certificates

A package containing a person's name (and possibly some other information such as an E-mail address and company name) and his public key and signed by a trusted party is called a digital certificate (or digital ID). An independent third party that everyone trusts, whose responsibility is to issue certificates, is called a Certification Authority (CA).  A digital certificate is a means of binding the details about an individual or organization to a public key. A digital certificate serves two purposes. First, it provides a cryptographic key that allows another party to encrypt information for the certificate's owner. Second, it provides a measure of proof that the holder of the certificate is who they claim to be - because otherwise, they will not be able to decrypt any information that was encrypted using the key in the certificate.

The recipient of an encrypted message uses the CA's public key to decode the digital certificate attached to the message, verifies it as issued by the CA and then obtains the sender's public key and identification information held within the certificate. With this information, the recipient can send an encrypted reply.

The most widely used standard for digital certificates is X.509, which defines the following structure for public-key certificates:

  1. Version identifies the certificate format.
  2. Serial Number is unique within the CA.
  3. Signature Algorithm identifies the algorithm used to sign the certificate.
  4. Issuer Name is the name of the CA.
  5. Period of Validity is a pair of dates, Not Before and Not After.
  6. Subject Name is the name of the user to whom the certificate is issued.
  7. Subject's Public Key includes the algorithm name and the public key of the subject.
  8. Extensions.
  9. Signature of the CA.

You can obtain a personal certificate from companies like VeriSign www.verisign.com or Thawte www.thawte.com.

Viruses, Worms and Trojans

Unix.  The world's first computer virus.
Title of Chapter 1 of The Unix Haters Handbook, ISBN: 1-56884-203-1

The above book is in fact written by serious computer scientists.  Nevertheless, we must disregard the suggestion that Unix is a virus as an attempt at being hilarious.  Equally unhelpful are the news media that use the term virus in referring to any piece of malicious software. The academic world uses the term "malware" for these.  Rigorous definitions have been given by many computer security experts but they do not match the typical use even by other security experts.  Thus, we must settle for practical "definitions" of malicious software.

Definitions

Macro Viruses

Macro languages are (often) equal in power to ordinary programming languages such as C.  A program written in a macro language is interpreted by the application.  Macro languages are conceptually no different from so-called scripting languages.  GNU Emacs uses Lisp, and most Microsoft applications use Visual Basic for Applications (VBA), as their macro languages. The typical use of a macro in applications, such as MS Word, is to extend the features of the application. Some of these macros, known as auto-execute macros, are executed in response to some event, such as opening a file, closing a file, starting an application, and even pressing a certain key.  A macro virus is a piece of self-replicating code inserted into an auto-execute macro. Once such a macro is running, it can copy itself to other documents, delete files, etc.  Another type of hazardous macro is one named for an existing command of the application.  For example, if a macro named FileSave exists in the "normal.dot" template of MS Word, that macro is executed whenever you choose the Save command on the File menu. Unfortunately, there is often no way to disable such features.

Unix/Linux Viruses

One of the most famous security incidents, the Internet Worm incident of 1988, began from a Unix system.  Even so, Unix systems were long considered virus-immune.  This is not so: several Linux viruses have been discovered. The Staog virus first appeared in 1996.   Now that Android (a Linux-based OS for handheld devices) has become popular, we will see many more Linux viruses.  See http://en.wikipedia.org/wiki/Linux_malware.

Spreading Malware via the Internet

Whereas a Trojan horse is delivered pre-built, a virus infects.  In the past, such malicious programs arrived via tapes and disks, and the spread of a virus around the world took many months.  Antivirus companies had time to identify a new viral strain, and create cleaning procedures.  Today, Trojan horses, worms and viruses are network deliverable as E-mail, Java applets, ActiveX controls, JavaScripted pages, CGI-BIN scripts, or as self-extracting packages. 

Integrated mail systems such as Microsoft Outlook make it very simple to send not only a quick note edited within a limited text editor but also previously composed computer documents of arbitrary complexity to anyone, and to work with objects that you receive via standards such as MIME. Java and ActiveX are now integrated  into mail systems. Both Java and ActiveX have been found to have security bugs.

Virus Detection

Known viruses are by far the most common security problem on modern computer systems. Several web sites maintain complete lists of known viruses.  There are thousands.  Visit, e.g., www.cai.com/virusinfo/encyclopedia/.  In November 2010, there were 600+ "PC Viruses in the Wild" (www.wildlist.org).  Virus detection programs analyze a suspect program for the presence of known viruses.

Fred Cohen has proven mathematically that perfect detection of unknown viruses is impossible: no program can look at other programs and say either "a virus is present" or "no virus is present", and always be correct. But, in the real world, most new viruses are sufficiently like old viruses that the same sort of scanning that finds known viruses also finds the new ones. And there are a large number of heuristic tricks that anti-virus programs use to detect new viruses, based either on how they look, or what they do.

Virus scanners are sometimes classified by their "generation."  First-generation virus scanners use a previously obtained virus signature, a bit pattern, to detect a known virus. They also record and check the length of all executables. Second-generation scanners examine executables with heuristic rules, looking, e.g., for fragments of code associated with a typical virus. They also do integrity checking by calculating a checksum of a program and storing the encrypted checksum somewhere else. Third-generation scanners use a memory-resident program to monitor the execution behavior of programs and identify a virus by the types of action that it takes. Fourth-generation virus detection combines all previous approaches and includes access control capabilities.
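A first-generation scanner amounts to searching files for known bit patterns. The following minimal sketch shows the idea; the signature names and byte patterns are invented for illustration and do not correspond to any real malware or to any real scanner's database format.

```python
# Hypothetical signature database: name -> byte pattern (made-up values).
SIGNATURES = {
    "Demo.A": bytes.fromhex("deadbeef0102"),
    "Demo.B": b"FORMAT C: /Y",
}

def scan(data: bytes) -> list:
    """Return the names of all known signatures found in the data."""
    return [name for name, pattern in SIGNATURES.items() if pattern in data]

clean = b"\x7fELF ordinary program bytes"
infected = clean + bytes.fromhex("deadbeef0102") + b" more bytes"

print(scan(clean))
print(scan(infected))
```

A scanner like this only finds what is already in its database, which is why later generations add heuristics, integrity checks, and behavior monitoring.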

File Integrity Checkers

Virus infections change program files.  Trojan programs are replacements for legitimate programs.  A necessary protection measure on any computer system where programs and other files are stored on read-write media such as hard disks is to make sure that files have not changed.

File integrity checking is about verifying that files have not been altered.  This includes an examination of the file content and meta information such as time stamps and permissions.  Immediately after a fresh install, it is expected that the meta information regarding the installed files is recorded on a write-once medium such as a DVD so that it can be compared with later.

Checking that file content has not been altered is very time consuming if we compare with original content either file by file or an entire partition on a hard disk.  Thus, typically we compute MD5 sums of the files immediately after a fresh install, and periodically re-compute the sums and compare.

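Note that the MD5 sum by itself is no longer reliable, because practical methods for constructing different files with the same MD5 sum have been found.  Nevertheless, checking two different sums together, such as the MD5 sum and the SHA1 sum, offers dependable detection of content change.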
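The record-then-compare procedure can be sketched as follows. This example uses SHA-256 from Python's hashlib rather than MD5 or SHA1; that substitution, the function names, and the throwaway demonstration file are all assumptions made for illustration. A real checker would also record permissions and time stamps and would keep the baseline on write-once media.

```python
import hashlib
import os
import tempfile

def file_digest(path: str, algorithm: str = "sha256") -> str:
    """Hash a file's content in chunks so large files need little memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def snapshot(paths):
    """Record a baseline mapping of path -> digest."""
    return {path: file_digest(path) for path in paths}

def changed_files(baseline):
    """Return the paths whose current digest differs from the baseline."""
    return [path for path, digest in baseline.items() if file_digest(path) != digest]

# Demonstration on a throwaway file.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "program.bin")
    with open(target, "wb") as f:
        f.write(b"original program bytes")
    baseline = snapshot([target])
    unchanged_report = changed_files(baseline)   # nothing has changed yet
    with open(target, "ab") as f:
        f.write(b" + injected code")             # simulate an infection
    tampered_report = changed_files(baseline)
    tampered_name = os.path.basename(tampered_report[0])

print(unchanged_report)
print(tampered_name)
```

Passing a different algorithm name to file_digest (for example "sha1") would produce the same kind of sum that the corresponding command-line tool reports.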

Firewalls

In the context of buildings, a firewall is a fireproof wall intended to prevent the spread of fire from one room or area of a building to another.   In the context of the Internet, the term has acquired a related meaning: preventing attacks from the outside from reaching the inside.  A typical intranet these days is not connected to the Internet directly.  Instead, we connect it to a firewall, and channel all transmissions through the firewall.

A firewall is a computer system dedicated to protecting a LAN from the Internet at large.  It sits at the entry point of the LAN it protects. All traffic between the LAN and any host on the Internet at large goes through the firewall.  Firewalls receive, inspect and make decisions about all incoming traffic before it reaches other parts of the system or network. They regulate outgoing traffic also.

A firewall package can be (and should be) run on a laptop or desktop also.

A rigorous definition of a firewall is difficult, as the term has been used with a variety of meanings by the Internet security industry.  A firewall can range from a simple packet filter to an enormously complex computer system with extensive logging systems, intrusion detection systems, etc.

Assumptions

  1. Firewalls make the assumption that the only way in or out of a corporate network is through the firewalls; that there are no "back doors" to your network. In practice, this is rarely the case, especially for a network which spans a large enterprise. Users may set up their own back doors, using modems, terminal servers, or such programs as "PC Anywhere" so that they can work from home. The more inconvenient a firewall is to your user community, the more likely someone will set up their own "back door" channel to their machine, thus bypassing your firewall.
  2. Firewalls make the assumption that all of the bad guys are on the outside of the firewall, and everyone on the inside of the firewall can be considered trustworthy. This neglects the large number of computer crimes which are committed by insiders.

Packet Filters

A Packet Filter filters packets (i.e., IP datagrams) based on certain rules.   This is the simplest of the firewalls.   The type of router used in a packet filtering firewall is known as a screening router (figure from Garfinkel and Spafford's book).  A screening router determines not only whether or not it can route a packet towards its destination, but also whether or not it should. "Should" or "should not" are determined by the site's security policy.

[Screening router diagram from Garfinkel&Spafford]
Using a screening router to do packet filtering

Here are some examples of filtering: 

  1. Block all incoming connections from systems outside the internal network, except for incoming SMTP connections (so that you can receive email).
  2. Block all connections to or from certain systems you distrust.
  3. Allow, e.g., SSH, SFTP, and HTTP services, but block "dangerous" services like TFTP, the X Window System, RPC, and the "r" services (rlogin, rsh, rcp, etc.).

The Windows built-in firewall is of this type. You can configure it to pass or drop packets as above.  Most wireless-AP-routers also have this feature.  Note that a packet filter does not deal with files; it examines the details of each packet: the source and destination IP addresses, port numbers, etc.  The filtering rules are based on patterns matching these details.  Thus, a packet filtering firewall cannot detect if viruses, worms, or Trojans are being downloaded.
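Rule matching of this kind can be sketched in a few lines. The Packet and Rule classes, the first-match-wins policy, the default of blocking, and the example addresses (taken from ranges reserved for documentation) are all assumptions for illustration; this is not how any particular firewall product is configured.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional

@dataclass
class Packet:
    src: str
    dst: str
    dst_port: int

@dataclass
class Rule:
    action: str                       # "allow" or "block"
    src_net: str = "0.0.0.0/0"        # matches any source by default
    dst_port: Optional[int] = None    # None matches any port

    def matches(self, pkt: Packet) -> bool:
        in_net = ip_address(pkt.src) in ip_network(self.src_net)
        port_ok = self.dst_port is None or self.dst_port == pkt.dst_port
        return in_net and port_ok

# Hypothetical policy: the first matching rule wins; the default is to block.
RULES = [
    Rule("block", src_net="203.0.113.0/24"),   # a distrusted network
    Rule("allow", dst_port=25),                # incoming SMTP
    Rule("allow", dst_port=22),                # SSH and SFTP
    Rule("block", dst_port=69),                # TFTP
]

def decide(pkt: Packet) -> str:
    for rule in RULES:
        if rule.matches(pkt):
            return rule.action
    return "block"

print(decide(Packet("198.51.100.7", "192.0.2.10", 25)))
print(decide(Packet("203.0.113.9", "192.0.2.10", 25)))
print(decide(Packet("198.51.100.7", "192.0.2.10", 6000)))
```

The decision uses only addresses and port numbers from the packet header, so the payload, including any virus it may carry, is never inspected.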

Content Filters

Content filters work at a level higher than packet filters.  E.g., (i) certain URLs can be blocked, (ii) as a program file is being downloaded it can be scanned for the presence of virus signatures.  But, newly evolving systems are blurring the lines between data and executables more and more. With  macros, JavaScript, Java, ActiveX and other forms of executable fragments which can be embedded inside data, a security model which neglects this will leave you wide open to a wide range of attacks.

Example Firewalls

It is highly recommended that you experiment with several firewalls in both Linux and Windows.  In Windows, experiment with

  1. the built-in firewall ( http://windows.microsoft.com/en-US/windows7/products/features/windows-firewall )
  2. Agnitum Outpost Free Firewall (web search for links)
  3. ZoneAlarm Personal Free (web search for links)

In Linux, the command-line based netfilter/iptables are standard.  There are many GUI front ends.  You should experiment with

  1. http://www.shorewall.net/
  2. http://www.fs-security.com/
  3. http://www.ipcop.org/

Passwords

Most system administrators generate, for their users, initial passwords that are hard to remember.  Soon after the users log in, they change their passwords to something they prefer.  These range from names and birth dates of spouses, relatives and friends to whatever.  An attacker who "stalks" a user often does a little bit of snooping around to discover these.

A recent survey by ... in the financial district of ... showed that poor choices are the norm for computer passwords there. A staggering 82% of the respondents said they used, in order of preference, "a sexual position or abusive name for the boss" (30%), their partner's name or nickname (16%), the name of their favorite holiday destination (15%), sports team or player (13%), and whatever they saw first on their desk (8%).

Most users have the same password for their accounts on different systems.  An attacker who broke into one account usually discovers these other accounts by going through the memoranda that the user keeps in his files as well as by running keystroke loggers or simple sniffers.

You should choose a password that is not susceptible to a dictionary attack or a social engineering attack.  A password should be a mix of letters and digits, and at least 8 characters long.  Obviously, it should be such that you can remember it without writing it down.  Some administrators force users to change passwords periodically, even when there has been no known password cracking incident, but this practice is controversial.

Don't use the following for passwords: Your first name. Your last name. Your login name. Your pet's name. Any name at all. SS number. House number.  Telephone number. Your bank PIN. Any password shorter than six characters.
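These guidelines translate naturally into a simple checker. The sketch below is illustrative only: the function name, the personal details, and the four-word stand-in dictionary are hypothetical, and a real checker would consult a far larger word list.

```python
def password_problems(password, personal_items, dictionary):
    """Return a list of reasons the password is weak (empty if none are found)."""
    problems = []
    lowered = password.lower()
    if len(password) < 8:
        problems.append("shorter than 8 characters")
    has_letter = any(ch.isalpha() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    if not (has_letter and has_digit):
        problems.append("not a mix of letters and digits")
    if any(item and item.lower() in lowered for item in personal_items):
        problems.append("contains personal information")
    if lowered in dictionary:
        problems.append("is a dictionary word")
    return problems

# Hypothetical user details and a tiny stand-in word list.
personal = ["alice", "smith", "asmith", "rex", "5551234"]
words = {"password", "dragon", "letmein", "sunshine"}

print(password_problems("rex", personal, words))
print(password_problems("sunshine", personal, words))
print(password_problems("t4ble-Lamp-9ine", personal, words))
```

Passing such a check does not make a password strong; it only rules out the most obvious weaknesses listed above.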

Social engineering is a "term used among crackers and samurai for cracking techniques that rely on weaknesses in wetware rather than software; the aim is to trick people into revealing passwords or other information that compromises a target system's security. Classic scams include phoning up a mark who has the required information and posing as a field service tech or a fellow employee with an urgent access problem." [ http://info.astrian.net/jargon/terms/s/social_engineering.html ]

System crackers often encrypt a dictionary of words and common passwords using all 4096 possible salt values. They then compare the encoded passwords in your /etc/passwd or /etc/shadow file with their database. Once they find a match, they have the password for another account. This is one of the most common methods for gaining or expanding unauthorized access to a system. Good machine-readable dictionaries are essential for cracking, and can be found easily on the web.  An 8-character password encodes to one of 4096 different 13-character strings, so each dictionary entry needs 4096 * 13 = 53,248 bytes. A dictionary of, say, 9,000,000 common words, names, passwords, and simple variations would therefore fit on a 500 GB hard drive. The attacker need only sort the encoded strings, and then check for matches.  A 500 GB hard disk now (Nov 2007) sells for about $100.
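The precompute-and-look-up attack can be sketched as follows. The toy_hash function is a stand-in built on SHA-256, not the real crypt(3) algorithm, and the word list and "stolen" entries are hypothetical; only the 4096 salt values come from the text above.

```python
import hashlib

def toy_hash(password: str, salt: int) -> str:
    """Stand-in for crypt(3): hash the salt together with the password."""
    return hashlib.sha256(f"{salt:04d}:{password}".encode()).hexdigest()

SALTS = range(4096)                  # every possible salt value
dictionary = ["password", "letmein", "dragon", "qwerty"]

# The attacker precomputes every (salt, word) combination once...
table = {toy_hash(word, salt): word for word in dictionary for salt in SALTS}

# ...then looks up each stolen entry. These are hypothetical
# (user, salt, stored hash) triples of the kind kept in a password file.
stolen = [
    ("alice", 1234, toy_hash("dragon", 1234)),
    ("bob", 77, toy_hash("x9!kQ2#vLp", 77)),
]

cracked = {user: table[h] for user, salt, h in stolen if h in table}

print(len(table))
print(cracked)
```

Only the account whose password appears in the dictionary is recovered, which is the practical argument for choosing passwords that no word list is likely to contain.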

Secure Shell, ssh

Most compromises in security and privacy now happen through networks.  Even though CEG233 is an early course in a degree program, and precedes a proper course on Computer Networks, you should begin reading about networks.

We recommend that ssh be used in place of telnet, rlogin, rsh, rcp, etc. The file transfer program sftp is based on ssh. With the current method (IPv4) of communicating between machines, anyone with access to the network can sniff the packets. The older programs send passwords and all other data in plain text, which can be readily captured and analyzed. Secure shell foils sniffing attempts by encrypting the packets (using ciphers) and by only allowing connections with known machines (using RSA public key technology to authenticate). In general, it does not trust the network. However, ssh offers no protection if an attacker gains root access, through other means, to either the local machine or the remote machine, and a man-in-the-middle attack can still succeed if you accept an unknown host key without verifying it.
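
The "known machines" check can be illustrated with key fingerprints. The sketch below computes a fingerprint in the SHA256 style that OpenSSH prints (the base64 of the SHA-256 digest of the key blob, with padding removed) and compares it with a stored value. The function names, the host name, and the key blob are made up for illustration; the blob is not a real RSA key, and this is not how ssh itself is implemented.

```python
import base64
import hashlib
import hmac


def fingerprint(public_key_line):
    """Fingerprint of a 'type base64-blob comment' public key line."""
    _key_type, blob_b64 = public_key_line.split()[:2]
    blob = base64.b64decode(blob_b64)
    digest = hashlib.sha256(blob).digest()
    return "SHA256:" + base64.b64encode(digest).decode().rstrip("=")


def is_known_host(host, presented_key_line, known_hosts):
    """True only if the host is on record and its key fingerprint matches."""
    expected = known_hosts.get(host)
    if expected is None:
        return False  # never seen this host: do not trust it silently
    return hmac.compare_digest(expected, fingerprint(presented_key_line))


if __name__ == "__main__":
    # Placeholder bytes standing in for a key blob (an assumption, not a real key).
    fake_blob = base64.b64encode(b"not-a-real-key-blob").decode()
    key_line = f"ssh-rsa {fake_blob} demo@example"
    known = {"lab.example.edu": fingerprint(key_line)}  # hypothetical host name
    print(fingerprint(key_line))
    print(is_known_host("lab.example.edu", key_line, known))
    print(is_known_host("other.example.edu", key_line, known))
```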

See the Lab on Networks; it includes ssh details.

Harmful WebSites

Search the web and discover answers to the following questions.  Note that some answers will be specific to the web browser.

  1. What are the dangers of visiting web sites?
  2. Can a web page download, or find and invoke, a program stored on your hard disk?
  3. What are cookies? Why do web sites deposit them?  Should you keep them forever? How can you examine them? How can you delete them?
  4. You must have seen: "This site may harm your computer."  Learn why Google and other search engines mark some websites so.
  5. How do you know that the answers you found on the web for the above are trustworthy?

Email

There are many excellent email security tutorials.  One of them is listed in the References.

Private Files and Directories

Learn the details of the following commands from our text book, from man pages, and by searching for them on the web.

  Linux     Windows    Brief description/Learning Objective                                        Limitations, if any
  md5sum               Computes the hash of a file's content using the MD5 cryptographic algorithm
  sha1sum              Computes the hash of a file's content using the SHA-1 cryptographic algorithm
  undelete  undelete   Recover an accidentally deleted file                                        Not always successful; non-standard command
  shred     shred      "Shred" a file                                                              Non-standard command
  wipe      wipe       Wipe out a file
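
What md5sum and sha1sum compute can be reproduced with Python's standard hashlib module. The following is a minimal sketch: the helper name file_digest and the temporary file are illustrative only. It reads the file in chunks, so large files need not fit in memory, and it shows that a one-character change produces a different digest.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="md5", chunk_size=65536):
    """Return the hex digest of a file's content, read chunk by chunk."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


if __name__ == "__main__":
    # Work on a throwaway file so that nothing of yours is touched.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sample.txt")
        with open(path, "wb") as handle:
            handle.write(b"hello world\n")
        before = file_digest(path, "md5")
        print("md5 :", before)
        print("sha1:", file_digest(path, "sha1"))

        # Change a single character and hash again.
        with open(path, "wb") as handle:
            handle.write(b"hello World\n")
        after = file_digest(path, "md5")
        print("md5 after edit:", after)
        print("digests differ:", before != after)
```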

Lab Experiment

In Windows, download from http://portableapps.com/ and install onto your USB drive all of the following security/privacy tools.  The first three are essential, and we are sure you will use them many times.  We use the remaining tools in this Lab.

  1. WinSCP Portable - SFTP, FTP and SCP client
  2. FileZilla Portable - the full-featured SFTP client
  3. PuTTY Portable - lightweight telnet and SSH client
  4. ClamWin Portable - antivirus on the go
  5. Eraser Portable - securely delete files and data
  6. KeePass Password Safe Portable - secure, easy-to-use password manager
  7. winMd5Sum Portable - check md5 sums to verify files on the go
  8. Toucan - backup, sync and encrypt for advanced users

All work is expected to be carried out in the Operating Systems and Internet Security (OSIS) Lab, 429 Russ.  You are, however, welcome to work elsewhere.  Note that the lab may require both Linux and Windows, as well as other software that is not always installed in other facilities.

Record the lines you type and your observations, as always, in a plain text file named myLabJournal.txt.  See the grading sheet, and include appropriate portions from myLabJournal.txt  into answers.txt. All descriptions asked for also go into this file.

In Linux

  1. Create a text file named myInfo.txt  in your home directory containing exactly four lines: Your full name, your UID, your email address, and the darkest wish ;-) you have, each on a separate line.  Make sure that this file is strictly for your eyes-only.  Not even the super-user should be able to read it.  Record how you did it.  Copy this file to your USB thumb drive.
  2. Compute the md5sum of myInfo.txt. Change just one or two characters in this file, and re-compute the md5sum.   See if you can change this file so that even after the change the md5sum comes out the same as before.  Try a few times (say 10).  Record your trials.
  3. Learn the details of the shred command.  Use the -v flag and describe how it securely deleted a file.
  4. Search the web, learn, and describe the purpose of /etc/hosts, /etc/hosts.deny and /etc/hosts.allow files.
  5. Invoke the web browser you have been using all this time.  Locate and copy the history it has recorded into the journal.  Describe what steps you can take to reduce/eliminate this history keeping.

In Windows

  1. Copy  myInfo.txt  that you saved above on your USB thumb drive to Windows TEMP directory. Make sure myInfo.txt is strictly for your eyes-only.  Not even the administrator (super-user) should be able to read it.  Record how you did it.
  2. Using winMd5Sum Portable perform Step 2 of Linux above.
  3. Use Eraser Portable to securely delete a file and compare it with shred.
  4. Use and then write a short (say around 10 lines)  how-to on KeePass Password Safe Portable.
  5. Use and then write a short (say around 10 lines)  how-to on ClamWin Portable.

Visit a few Sites

  1. Visit http://anonymouse.org/ and experience their service.  Record your observations and opinions.
  2. Visit www.cnn.com.  Read a few stories, say for 5 minutes.  Discover if this site has deposited any cookies.  Where?  If it did, copy them as "regular" text lines into the journal, and describe their content as best as you can.
  3. Discover the answers to the questions posed in Harmful WebSites.
  4. Spend a few minutes browsing the site http://packetstormsecurity.org/  and describe what kind of a site it is.
  5. Search and find a serious violation of privacy that happened in the last 12 months.

Turnin

Note the number <n> of this Lab from the course home page and use L<n> as the first argument to turnin.

Acknowledgements

References

  1. Wikipedia, Email, http://en.wikipedia.org/wiki/Email Required Reading.
  2. Tim Richardson, Simple Notes on Internet Security and Email, http://www.tim-richardson.net/misc/security.html   Recommended reading.
  3. Microsoft Research, "Trends in Cybercrime," http://www.technologynewsdaily.com/node/8335  Recommended reading.
  4. SANS Top-20 Internet Security Attack Targets, https://www2.sans.org/top20/ Required Visit.
  5. Microsoft, 5-Minute Security Advisor, http://technet.microsoft.com/en-us/library/dd310373.aspx  Required Visit.
  6. Link to Grading Sheet

Copyright © 2010 Prabhaker Mateti