The filesystems on the SP are designed to be similar to the layout on the Cray. In brief, there are only two filesystems that are available for general use: the home directory filesystem, which excels at storing many small files, but has a very limited amount of space, and a large archive filesystem, which excels in performance and storing large files.
The primary difference between these two machines and most other high-performance machines (such as those at NCAR) is that none of our filesystems use a scrubber. This has a couple of advantages, the primary one being simplicity: you run your jobs from the same place that you keep the data. It also means that it's important to clean up after your jobs, especially if they leave a large number of source files or object files (*.o) behind. Since all files get saved, it's important to remove the files that you don't need, and use "tar" to consolidate the run into a single archive file.
/sparch is the equivalent of /svarch on earth. In fact, earth's /svarch is mounted read-only on all SP nodes to make sure that your data is easily accessible across the platforms.
/sparch is accessible to all nodes in the SP, but it does not have the typical performance of a network filesystem -- it is much faster. /sparch should be used to store large files, and files that will require a lot of manipulation during your job run. Please avoid storing large numbers of small files in /sparch, since (like /svarch on the Cray) this becomes very inefficient when the filesystem starts to archive to the tape library. This means that it's very important to clean up unnecessary files after your run, and to tar the entire directory up for storage.
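The cleanup and tar step described above could be scripted roughly as follows. This is only a sketch: the function name archive_run and the example path are hypothetical, and you should check what a run directory contains before deleting anything from it.

```shell
# Sketch: tidy a finished run directory and pack it into one tar file.
# archive_run is a hypothetical helper name, not a site-provided command.
archive_run() {
    run_dir=$1                        # directory holding the finished run
    parent=$(dirname "$run_dir")
    name=$(basename "$run_dir")

    # Remove object files left behind by the build.
    find "$run_dir" -type f -name '*.o' -exec rm -f {} \;

    # Pack the whole directory into <name>.tar next to it.
    (cd "$parent" && tar -cf "$name.tar" "$name") || return 1

    # Make sure the archive can be listed before deleting the originals.
    tar -tf "$parent/$name.tar" > /dev/null || return 1
    rm -rf "$run_dir"
}
```

For example, "archive_run /sparch/jdoe/run42" (a made-up path) would leave a single file, /sparch/jdoe/run42.tar, in place of the directory.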
/sparch has 210 GB of disk space, and uses hierarchical storage to migrate unused data to tape, like on earth. In addition, the data is backed up nightly, and backups are kept for about 2 months.
For storing small files (such as source code), your home directory is also accessible to all nodes. This is where your compiling should be done, since /home will be the fastest filesystem on splogin. Please limit your home directory usage to 100 MB. There are no quotas in place yet, so we ask that you monitor your usage carefully, as there are no safeguards against filling the filesystem.
Note that /home will be considerably slower on the nodes.
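Since there are no quotas, one way to keep an eye on your home directory is a small check built on "du". The sketch below assumes a POSIX shell; check_usage is a hypothetical helper name, and the default of 100 MB is simply the limit requested above.

```shell
# Sketch: report how much space a directory uses and warn above a limit.
# check_usage is a hypothetical helper, not a site-provided command.
check_usage() {
    dir=${1:-$HOME}                   # directory to measure
    limit_mb=${2:-100}                # limit in megabytes
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    used_mb=$((used_kb / 1024))
    if [ "$used_mb" -gt "$limit_mb" ]; then
        echo "WARNING: $dir uses ${used_mb} MB (limit ${limit_mb} MB)"
        return 1
    fi
    echo "OK: $dir uses ${used_mb} MB (limit ${limit_mb} MB)"
}
```

Running "check_usage" with no arguments measures $HOME against 100 MB. The non-zero return status on a warning makes it easy to call from a job script.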
Each node has a separate /tmp filesystem that is world-writable. If your jobs need to use this directory, please make sure that they delete the temporary files after the run is complete. Do not use this filesystem for anything but small temporary files created by your run.
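A common way to guarantee that temporary files are removed is to keep them in one private directory and delete it with a shell trap. The following is a minimal sketch, assuming the "mktemp" command is available on the nodes; run_with_scratch and the SCRATCH variable are hypothetical names chosen for illustration.

```shell
# Sketch: run a command with SCRATCH pointing at a private directory
# under /tmp, and delete that directory when the command finishes,
# whether it succeeds or fails. Assumes mktemp is installed.
run_with_scratch() (
    SCRATCH=$(mktemp -d "${TMPDIR:-/tmp}/job.XXXXXX") || exit 1
    export SCRATCH
    trap 'rm -rf "$SCRATCH"' EXIT     # cleanup runs when this subshell exits
    "$@"
    status=$?
    exit "$status"
)
```

A job step could then be wrapped as run_with_scratch sh -c 'sort input.dat > "$SCRATCH/sorted.tmp" && uniq "$SCRATCH/sorted.tmp" > output.dat', where the file names are placeholders. Because the function body is a subshell, the trap does not affect the calling script.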
/home and /sparch are backed up nightly using TSM (Tivoli Storage Manager) to our Storagetek tape silo. Backups are kept for about 2 months, at which point they are discarded to make room for new backups.
Here is how the Cray and SP filesystems are cross-mounted for access to data.
| Filesystem | earth | SP | post.essc.psu.edu | other ECF Suns |
| earth:/svarch | local | read-only | read-write | read-only |
| SP:/sparch | read-only | local | read-write | read-only |
Basically, the two filesystems are cross-mounted read-only everywhere to provide fast access to data. The exception is post.essc.psu.edu, which has read-write access to do postprocessing work.
sftp is part of the collection of tools that come with ssh. It is very similar to regular FTP, except that all transactions including your password are encrypted. Please use this tool if you are connecting to the SP from outside the Environment Institute network.
scp is another command that ships as part of the ssh unix client. It works much like rcp, but it transfers files far more securely and works even when passwords are required. It can be faster and more convenient than ftp if you know exactly which files you want to transfer.
The downside of scp is that since it encrypts all of the data that it transfers, it can be considerably slower than ftp. This makes it less useful on files larger than a few megabytes.
Usage:
scp [[user@]hostname:]sourcefile [[user@]hostname:]destination
You can also specify more than one source file if the destination is a directory. Note that either the source or the destination (or both!) can be on remote hosts, and you can specify alternate userids if they differ from your userid on the current machine.
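A few typical invocations that follow the usage line above are sketched here. The userids "jdoe" and "jsmith", the host "otherhost", the file names, and the per-user directories under /sparch are all hypothetical, and "splogin" is written as a short hostname that may need to be fully qualified from your machine. By default the function only prints the commands; pass "run" to execute them.

```shell
# Sketch of common scp invocations. All users, hosts other than splogin,
# file names, and directory layouts below are made up for illustration.
scp_examples() {
    if [ "$1" = run ]; then run=""; else run="echo"; fi

    # Local file to a remote directory.
    $run scp results.tar jdoe@splogin:/sparch/jdoe/

    # Several source files; the destination must be a directory.
    $run scp model.f params.nml jdoe@splogin:src/

    # Remote file to the current local directory.
    $run scp jdoe@splogin:/sparch/jdoe/results.tar .

    # Both ends remote, with a different userid on each host.
    $run scp jsmith@otherhost:input.tar jdoe@splogin:/sparch/jdoe/
}
```

A remote path without a leading slash, such as "src/" above, is taken relative to your home directory on the remote host.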
splogin also offers traditional FTP services. Please be aware that since this is standard FTP, your password is sent in the clear across the network. Because of this, please use sftp (see above) instead. It's especially important not to use traditional FTP if you are connecting to the SP from outside the Environment Institute network.