/*
 * Copyright (c) 2001 BBNT Solutions LLC
 *
 * Permission to use, copy, modify, and distribute this software
 * and its documentation for any purpose is hereby granted without
 * fee, provided that the above copyright notice and this permission
 * appear in all copies and in supporting documentation, and that the
 * name of BBN Technologies not be used in advertising or publicity
 * pertaining to distribution of the software without specific,
 * written prior permission. BBN makes no representations about the
 * suitability of this software for any purposes. It is provided "AS
 * IS" without express or implied warranties.
 */

/*
 * Copyright (c) 2006--2019 International Computer Science Institute
 *
 * Permission is hereby granted, free of charge, to any person
 * obtaining a copy of this software and associated documentation files
 * (the "Software"), to deal in the Software without restriction,
 * including without limitation the rights to use, copy, modify, merge,
 * publish, distribute, sublicense, and/or sell copies of the Software,
 * and to permit persons to whom the Software is furnished to do so,
 * subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be
 * included in all copies or substantial portions of the Software.
 *
 * The names and trademarks of copyright holders may not be used in
 * advertising or publicity pertaining to the software without specific
 * prior permission. Title to copyright in this software and any
 * associated documentation will at all times remain with the copyright
 * holders.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
 * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
 * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
 * NONINFRINGEMENT.
 * IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
 * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
 * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
 * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 * SOFTWARE.
 */

stats v2.8.2
Mark Allman (mallman@icir.org)
May 2019

The aim of this program is to do small amounts of data manipulation
and provide basic statistics for datasets. For in-depth statistical
analysis you should probably look at another utility.

This program started as a small pile of perl when I was an undergrad
(stats v0), progressing to a nice little C utility when I was in grad
school (stats v1) and then to a slightly nicer and more featureful
pile of C (if there is such a thing!) after grad school.

If you add something useful to the program, please drop me a note and
I'll add it to the code base. If you find a bug, I'd appreciate
hearing about that, as well.

The original "stats" program has been augmented over the years by
several additional small tools that are included. Each tool is
described below.

Building is as easy as running "make". The tool requires the GNU
"readline" library -- which I assume is fairly standard these days.
After building you can re-locate the binaries to wherever you like to
keep such things (~/bin, /usr/local/bin, etc.).

I routinely use stats under FreeBSD and OSX. I have also used it
under NetBSD, Linux and Solaris in the past. I expect that it should
compile and run fine under any Unix variant.

Changes since version 2.4 are outlined in the ChangeLog file. In
addition, several possibly useful items that are not compiled by
default are enumerated at the end of the ChangeLog, for the
interested.

STATS
=====================================================================

The tool expects input to be one data point per line. On the command
line, "-" denotes that the input should come from standard input.
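
As an illustration of the input convention just described, here is a
minimal C sketch. It is not taken from the stats source: the names
open_input() and read_points(), the 256-byte line buffer, and the
choice to skip lines that do not start with a number are all
assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map a command-line argument to an input
 * stream, treating "-" as standard input. Returns NULL if a named
 * file cannot be opened. */
FILE *open_input(const char *name)
{
    if (strcmp(name, "-") == 0)
        return stdin;
    return fopen(name, "r");
}

/* Hypothetical helper: read up to cap values, one per line, into out.
 * Lines that do not begin with a number are skipped (an assumption;
 * the real tool may behave differently). Lines longer than the buffer
 * are not handled specially. Returns the number of values stored. */
size_t read_points(FILE *in, double *out, size_t cap)
{
    char line[256];
    size_t n = 0;

    assert(in != NULL && out != NULL);
    while (n < cap && fgets(line, sizeof line, in) != NULL) {
        double v;
        if (sscanf(line, "%lf", &v) == 1)
            out[n++] = v;
    }
    return n;
}
```

A caller would pass each command-line argument to open_input(), check
the result for NULL, and then hand the stream to read_points().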

Internally the data is kept in a big array of doubles. This makes
lots of things easy (such as finding percentiles). However, the
downside is that the memory usage is somewhat gross and, since I
dynamically increase the size of the array, the program can be slow
at times. If you have a lot of data and know it, you can use the
"-M X" option to give the tool a hint about the number of data items
and therefore boost performance, in that realloc() will not be called
as much.

Stats has three basic modes of operation. Each will be explained in
turn.

  (1) Interactive

      Type "help" at the stats command line for a list of commands.

  (2) Batch mode (or command line mode)

      Type "stats -h" in the shell for command-line usage
      instructions.

  (3) Script mode

      You can write a script containing any of the commands you would
      give in interactive mode. Invoking "stats -f scriptname" will
      then execute the stats script.

One of the bugs is that there is very little documentation. However,
after figuring out a few things I am sure you'll agree that it is a
fairly straightforward and flexible utility to use. I hope it serves
you well.

LESQ
=====================================================================

This tool takes (x,y) points from the input file(s) (formatted as
"xy") and performs Least Squares fitting of the data to produce the
equation for the line that characterizes the data. The arguments on
the command line are files to process (with "-" indicating that the
tool should read from standard input).

DUMPDIFF
=====================================================================

This tool dumps the difference between subsequent samples of the
input files. The format of the input files is one data point per
line. The differences are written to standard output by default or
some file when using the "-o filename" option. The input comes from
the filename(s) listed on the command line (or, standard input if "-"
is given).
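
The differencing that dumpdiff performs can be sketched in a few
lines of C. This is an illustration, not the tool's actual code: the
function name diff_samples() is hypothetical, and the sign convention
(each sample minus the one before it) is an assumption, since the
description above does not spell it out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: store in[i + 1] - in[i] in out[i] for each
 * adjacent pair of samples. out must have room for n - 1 values.
 * Returns the number of differences written (0 when n < 2). */
size_t diff_samples(const double *in, size_t n, double *out)
{
    if (n < 2)
        return 0;
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i + 1 < n; i++)
        out[i] = in[i + 1] - in[i];
    return n - 1;
}
```

Under this convention the samples 1, 4, 2, 2.5 yield the differences
3, -2, 0.5, so n samples always produce n - 1 output lines.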

TRANSF
=====================================================================

This tool takes input from files with one data point per line and
then transforms these values by applying a user-specified function.
There are a number of functions provided and more can be added fairly
easily. As with the other tools, "-" indicates that input should be
taken from standard input.
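
One way a set of named transformation functions could be organized so
that new ones are easy to add is a lookup table of function pointers.
The sketch below is only an assumption about such a design, not a
description of how transf is written; the function names "negate",
"square" and "halve" are made up for the example and are not the
functions the tool actually provides.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef double (*transform_fn)(double);

/* Hypothetical example transformations. */
static double tf_negate(double x) { return -x; }
static double tf_square(double x) { return x * x; }
static double tf_halve(double x)  { return x / 2.0; }

struct transform {
    const char  *name;
    transform_fn fn;
};

/* Adding a transformation means adding one function and one row. */
static const struct transform transforms[] = {
    { "negate", tf_negate },
    { "square", tf_square },
    { "halve",  tf_halve  },
};

/* Look up a transformation by name; returns NULL if it is unknown. */
transform_fn find_transform(const char *name)
{
    size_t count = sizeof transforms / sizeof transforms[0];

    for (size_t i = 0; i < count; i++)
        if (strcmp(transforms[i].name, name) == 0)
            return transforms[i].fn;
    return NULL;
}

/* Apply fn to each of the n values in place. */
void apply_transform(transform_fn fn, double *values, size_t n)
{
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        values[i] = fn(values[i]);
}
```

With a table like this, the name given by the user is resolved once
with find_transform(), and the returned pointer is applied to every
data point as it is read.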