Copyright (C) 2002-2008 by Ichiroh Kanaya, PineappleDesign.org
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see http://www.gnu.org/licenses/.
This document describes how to use the Vector Stream library and tools, which are distributed under the GPLv3 by Ichiroh Kanaya.
Vector Stream provides file I/O APIs for vectors (sequences of numbers) in standard C. The Vector Stream library stores a C-style array of values (float and double are supported at the moment), in its own format, to a file specified by a pointer to a FILE structure.
The file format provided by Vector Stream (the vector file format) is simple and human-readable, yet allows efficient streaming (file I/O).
Before presenting the format itself, let us see how the Vector Stream library works in your C code.
By using the Vector Stream library, you can read and write arrays of numbers from and to a file with little effort. For example, if you have an array of float like

    #define N 10
    float x[N] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

then you can write it to a file as follows.

    vec_put_float_vector_to_file(N, x, 1, stdout);

The format of the file created here is called the "vector file format". Once the vector file has been created, you can read it as follows.

    size_t n;
    float *y;
    vec_new_float_vector_from_file(&n, &y, stdin);
    some_function(y[0], y[1], y[2]);
    /* ... */

Memory is allocated automatically. After you have finished using y, you must free the memory manually.

    /* ... */
    vec_delete_float_vector(y);

This is C.
The Vector Stream file format is simple and human-readable, yet supports an efficient implementation of a streaming library. The key idea of the vector file format is to impose only the minimum requirements needed to hold numerical arrays.
First, a vector file is a text file. You can put only the following terms in the file.
nil
Second, a Vector Stream file can contain only a 1-dimensional array. Remember that multi-dimensional arrays are avoidable in most scientific fields. A multi-dimensional array is needed only if its elements have variable (or multiple) lengths. For this purpose, you may want to use an object collection mechanism such as those found in Objective-C or Java.
Third, you must state the number of elements before the elements of the array themselves. For example, if you have an array [0 1 2 3], then you must put

    4 0 1 2 3

into the file. This greatly reduces the memory allocation cost in the library, since the library knows the size of the array before it reads the elements.
You can put comments anywhere. For example, the following is a valid Vector Stream file.

    % Comment started from the beginning of the line
    3 % Number of elements
    1 % The first element
    % This is blank line.
    2 3 % The second and third elements.
    % File end.
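The benefit of stating the element count first can be sketched in a few lines of C. The reader below is only an illustration, not the library's implementation: the names sketch_skip_blank and sketch_read_vector are hypothetical, comments are assumed to run from % to the end of the line, and nil and header handling are left out.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Skip whitespace and %-comments (assumed to run to the end of the line).
   Returns the next significant character without consuming it, or EOF. */
static int sketch_skip_blank(FILE *in)
{
    int c;
    while ((c = getc(in)) != EOF) {
        if (c == '%') {
            while ((c = getc(in)) != EOF && c != '\n')
                ;
            if (c == EOF)
                return EOF;
        } else if (c != ' ' && c != '\t' && c != '\n' && c != '\r') {
            ungetc(c, in);
            return c;
        }
    }
    return EOF;
}

/* Hypothetical reader: reads the element count first, allocates once,
   then reads exactly that many values. Returns NULL on malformed input
   or allocation failure. The caller frees the result with free(). */
double *sketch_read_vector(FILE *in, size_t *n)
{
    unsigned long count;
    size_t i;
    double *v;

    assert(in != NULL && n != NULL);
    if (sketch_skip_blank(in) == EOF || fscanf(in, "%lu", &count) != 1)
        return NULL;
    /* One allocation is enough because the size is known up front. */
    v = malloc((count ? count : 1) * sizeof *v);
    if (v == NULL)
        return NULL;
    for (i = 0; i < count; ++i) {
        if (sketch_skip_blank(in) == EOF || fscanf(in, "%lf", &v[i]) != 1) {
            free(v);
            return NULL;
        }
    }
    *n = (size_t)count;
    return v;
}
```

Because the count arrives before the data, the reader never has to grow its buffer while streaming, which is the cost saving described above.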
Fourth, you are encouraged to start all vector files with this header.

    %!VCTR
    %-format=1.2

The author also recommends that the filenames of all vector files end with the ".v" suffix.
Fifth, you can attach hints and messages to a vector file by using special forms of comments, as follows.

    %?Hello, world.
    %*option=123

A message starting with %? might be printed on the console during processing by the Vector Stream tools, and might be passed on to the next vector file if the processes form a pipeline. An option string starting with %* might be analyzed by the Vector Stream tools; for example, option strings could hold the stride of the vector, as in the following sample, which represents ([1 2 3], [10 20 30], [100 200 300]).
    %!VCTR
    %-format=1.2
    %*dimension=3
    %*cardinality=3
    %*minimum=1:2:3
    %*maximum=100:200:300
    %*average=37:74:111
    9 % Number of elements (3-dimensional vector x 3)
    1 2 3 % The first vector
    10 20 30 % The second vector
    100 200 300 % The third vector
    % End of file.
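A tool that wants to use hint lines such as %*dimension=3 has to split them into a name and a value. The following is a minimal sketch of that step, assuming one hint per line; sketch_parse_hint is a hypothetical helper and is not part of the Vector Stream API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split a hint line of the form "%*key=value" into
   key and value. Returns 1 on success, 0 if the line is not a hint or a
   buffer is too small. A trailing line break is dropped from the value. */
int sketch_parse_hint(const char *line, char *key, size_t key_size,
                      char *value, size_t value_size)
{
    const char *eq;
    size_t key_len, value_len;

    assert(line != NULL && key != NULL && value != NULL);
    if (strncmp(line, "%*", 2) != 0)
        return 0;
    line += 2;
    eq = strchr(line, '=');
    if (eq == NULL)
        return 0;
    key_len = (size_t)(eq - line);
    value_len = strcspn(eq + 1, "\r\n");
    if (key_len + 1 > key_size || value_len + 1 > value_size)
        return 0;
    memcpy(key, line, key_len);
    key[key_len] = '\0';
    memcpy(value, eq + 1, value_len);
    value[value_len] = '\0';
    return 1;
}
```

Applied to the sample above, the line %*minimum=1:2:3 would yield the key "minimum" and the value "1:2:3"; the value is left as a string so that the caller can decide how to interpret it.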
The early version of the vector file format was proposed by the author in 2002 for the purpose of his research. In 2008, the format was slightly modified while keeping backward compatibility with the previous versions.
The vector file format version 1.2 was designed in 2008.
%* metadata field appeared.
%? message field appeared.
nil for empty vector, instead of 0 vector.

The vector file format version 1.1 was designed in 2002. Version 1.1 of the format has the following features.
%!VCTR header appeared.
%-format=1.1 format identifier appeared.

The vector file format was originally introduced in 2002. Version 1.0 of the format has the following restriction.
The Vector Stream library provides a C89/C90 API and also provides a C++03 API at the moment. The C API is accessible from C++, but there all functions of the C API are located in the namespace pid.
The Vector Stream library provides the following functions. The C APIs are defined in the <vec.h> header. C++ users can include <vec++.hh> instead, to use short-cuts to the C functions.
    extern int vec_put_header_to_file(FILE *fout);

This function puts a valid header (i.e. %!VCTR ...) to fout.
    extern int vec_put_nil_to_file(FILE *fout);

This function puts nil to fout.
    extern int vec_put_float_vector_to_file(size_t n, const float *v, size_t s, FILE *fout);

vec_put_float_vector_to_file puts the value array v of size n to the file fout. If s is more than 1, this function puts an LF every s elements; otherwise it puts the elements one per line. C++ users can use this function with the short-cut name pid::put_vector.
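The effect of the s parameter can be illustrated with a small stand-alone writer. This is a sketch under assumptions, not the library's code: sketch_put_vector is a hypothetical name, and the layout (the count on its own line, values separated by single spaces, %g formatting) is a guess at a plausible output rather than the exact text the library emits.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical writer. Assumed layout: the element count on its own line,
   then the values, with a line feed after every s elements
   (s == 0 is treated like s == 1 here). Returns 0 on success, -1 on a
   write error. */
int sketch_put_vector(size_t n, const float *v, size_t s, FILE *out)
{
    size_t i;

    assert(out != NULL && (v != NULL || n == 0));
    if (s == 0)
        s = 1;
    if (fprintf(out, "%lu\n", (unsigned long)n) < 0)
        return -1;
    for (i = 0; i < n; ++i) {
        /* End the line after every s-th element and after the last one. */
        int end_of_line = ((i + 1) % s == 0) || (i + 1 == n);
        if (fprintf(out, "%g%c", (double)v[i], end_of_line ? '\n' : ' ') < 0)
            return -1;
    }
    return 0;
}
```

With s equal to the dimension of the data, each output line holds one multi-dimensional vector, which keeps the file easy to read by eye.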
    extern int vec_new_float_vector_from_file(size_t *n, float **v, FILE *fin);

vec_new_float_vector_from_file gets the vector *v of size *n from the file fin. Memory for *v is allocated automatically. C++ users can use this function with the short-cut name pid::new_vector.
    extern void vec_delete_float_vector(float *v);

After using a vector, users are encouraged to release its memory with this function. C++ users can use this function with the short-cut name pid::delete_vector.
    extern int vec_put_double_vector_to_file(size_t n, const double *v, size_t s, FILE *fout);
    extern int vec_new_double_vector_from_file(size_t *n, double **v, FILE *fin);
    extern void vec_delete_double_vector(double *v);

The above functions are the double versions of the put/new/delete functions. The short-cut names for C++ users are the same.
    extern int vec_put_hint_to_file(const char *hint, int parameter, FILE *fout);

This function puts a hint to the file fout in the format %*hint=parameter.
    extern int vec_put_message_to_file(const char *message, FILE *fout);

This function puts a message to the file fout in the format %?message. The \n characters in message are ignored.
    extern int vec_scan_messages_from_file_and_put_to_file(FILE *fin, FILE *fout);

This function scans the file fin for messages (lines that start with %?) and puts them to the file fout.
    extern int vec_slice_double_vector(double *a, const double *v, size_t offset, size_t length, size_t stride);

This function slices the vector v and stores the sliced vector in a. The memory for the sliced vector a must be allocated before calling this function. The slicing parameters are the offset offset, the length length, and the stride stride. This function is an alternative to C++'s std::slice.
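Since the function is described as an alternative to std::slice, its behavior can be pictured with the same indexing rule. The sketch below assumes that rule, namely a[i] = v[offset + i * stride]; sketch_slice is a hypothetical stand-in and the library may differ in details such as return values or range checks.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical slice with std::slice-like semantics (an assumption about
   what the library does): a[i] = v[offset + i * stride] for i < length.
   The caller must make sure that a holds length elements and that every
   index read from v is in range. */
void sketch_slice(double *a, const double *v,
                  size_t offset, size_t length, size_t stride)
{
    size_t i;

    assert(length == 0 || (a != NULL && v != NULL));
    for (i = 0; i < length; ++i)
        a[i] = v[offset + i * stride];
}
```

For a file that stores 3-dimensional vectors back to back, offset 1 with stride 3 would pick out the second component of every vector.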
    extern int vec_add_double_multi_vector_to_multi_vector(double *a, size_t s, size_t n1, const double *v1, size_t n2, const double *v2);
    extern int vec_add_double_single_vector_to_multi_vector(double *a, size_t s, size_t n1, const double *v1, size_t n2, const double *v2);

These functions calculate the sum of two vectors v1 (size n1) and v2 (size n2) and store the result in a. The memory for the resulting vector a must be allocated before calling these functions. The former function takes vectors v1 and v2 of the same size; corresponding elements are added and stored in the corresponding positions of a. The latter function takes the smaller vector as v1 and repeats it while adding it to v2.
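The "repeat the smaller vector" behavior can be written down in a few lines. This is a sketch of the idea only: sketch_add_repeating is a hypothetical name, the s parameter of the real functions is left out because its exact role is not described here, and the rule a[i] = v1[i % n1] + v2[i] is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical addition with repetition: a[i] = v1[i % n1] + v2[i] for
   i < n2. When n1 == n2 this is plain element-wise addition; when n1 is
   smaller, v1 is reused cyclically. a must hold n2 elements.
   Returns 0 on success, -1 if n1 is 0. */
int sketch_add_repeating(double *a, size_t n1, const double *v1,
                         size_t n2, const double *v2)
{
    size_t i;

    assert(n2 == 0 || (a != NULL && v1 != NULL && v2 != NULL));
    if (n1 == 0)
        return -1;
    for (i = 0; i < n2; ++i)
        a[i] = v1[i % n1] + v2[i];
    return 0;
}
```

Adding the single vector [1 2 3] to the multi-vector ([10 20 30], [100 200 300]) in this way shifts every stored vector by the same amount.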
    extern int vec_multiply_double_multi_matrix_to_multi_vector(double *a, size_t s, size_t nm, const double *m, size_t nv, const double *v, int transpose);
    extern int vec_multiply_double_single_matrix_to_multi_vector(double *a, size_t s, size_t nm, const double *m, size_t nv, const double *v, int transpose);

These functions calculate the products of matrices m (size nm, stride s) and vectors v (size nv, stride s), and store the result in a. The memory for the resulting vector a must be allocated before calling these functions. The former function takes the same number of matrices m and vectors v; each matrix is multiplied by the corresponding vector and the product is stored in the corresponding position of a. The latter function takes a single matrix as m and repeats it for multiplying with each vector of v.
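As an illustration of the single-matrix case, the following sketch applies one s-by-s matrix to every s-element vector packed in v. Several points are assumptions and not taken from the library: the name sketch_multiply_single_matrix, the row-major storage of m, the restriction to square matrices, and the reading of transpose as "use the transposed matrix when nonzero".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-matrix product. m is an s-by-s matrix in row-major
   order (assumption); v holds nv / s vectors of s elements each. Each
   vector is multiplied by m, or by its transpose when transpose is
   nonzero, and stored at the same position of a. a must hold nv elements
   and must not overlap v. Returns 0 on success, -1 on a size mismatch. */
int sketch_multiply_single_matrix(double *a, size_t s, const double *m,
                                  size_t nv, const double *v, int transpose)
{
    size_t k, i, j;

    assert(nv == 0 || (a != NULL && m != NULL && v != NULL));
    if (s == 0 || nv % s != 0)
        return -1;
    for (k = 0; k < nv; k += s) {
        for (i = 0; i < s; ++i) {
            double sum = 0.0;
            for (j = 0; j < s; ++j) {
                double mij = transpose ? m[j * s + i] : m[i * s + j];
                sum += mij * v[k + j];
            }
            a[k + i] = sum;
        }
    }
    return 0;
}
```

The multi-matrix variant would follow the same loop with a separate matrix for each vector instead of the single shared one.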
    typedef int (*vec_error_handler_t)(int error_type, const char *error_message);
    extern vec_error_handler_t vec_set_error_handler(vec_error_handler_t new_error_handler);

If an error occurs in the Vector Stream library, a default error handler is invoked. The default error handler writes an error message to stderr and calls exit(1) if the error is fatal; otherwise it returns. By using the vec_set_error_handler function you can modify the default behavior (e.g., you can throw an exception if you are using C++). This function returns the current error handler.
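The replaceable-handler pattern described above can be sketched without the library. All names below (sketch_set_error_handler, sketch_raise, sketch_counting_handler) are hypothetical, and treating a nonzero error_type as "fatal" is an assumption made only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef int (*sketch_error_handler_t)(int error_type, const char *error_message);

/* Default handler: report on stderr and exit on fatal errors (a nonzero
   error_type is assumed to mean "fatal" in this sketch). */
int sketch_default_handler(int error_type, const char *error_message)
{
    fprintf(stderr, "error: %s\n", error_message);
    if (error_type != 0)
        exit(1);
    return error_type;
}

static sketch_error_handler_t sketch_current_handler = sketch_default_handler;
static int sketch_error_count = 0;

/* Example replacement handler: counts errors and never exits. */
int sketch_counting_handler(int error_type, const char *error_message)
{
    (void)error_message;
    ++sketch_error_count;
    return error_type;
}

/* Install a new handler and return the one that was installed before.
   Passing NULL restores the default (an assumption of this sketch). */
sketch_error_handler_t sketch_set_error_handler(sketch_error_handler_t new_handler)
{
    sketch_error_handler_t old = sketch_current_handler;
    sketch_current_handler = new_handler ? new_handler : sketch_default_handler;
    return old;
}

/* How library code might report an error through the installed handler. */
int sketch_raise(int error_type, const char *error_message)
{
    assert(error_message != NULL);
    return sketch_current_handler(error_type, error_message);
}
```

Returning the previous handler lets a caller install its own handler for a limited scope and restore the old one afterwards.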
The Vector Stream library provides C++ APIs on top of the C APIs. The functions and templates are provided through the <vec++.hh> header.
The C++ API of Vector Stream provides short-cut names for the C APIs. All C APIs and their short-cut versions are declared in the pid namespace.
The C++ API provides the class template vector_loader. The following example shows how to use this template.

    #include <iostream>
    #include <map>
    #include <string>
    #include <valarray>
    #include <vector>
    #include <vec++.hh>

    int main(int argc, char **argv)
    {
        pid::vector_loader<double> *vl = new pid::vector_loader<double>(argv[2]);
        std::valarray<double> *v = vl->values();
        std::vector<std::string> *m = vl->messages();
        std::vector<std::string>::const_iterator i = m->begin();
        while (i != m->end()) {
            std::cerr << *i << '\n';
            ++i;
        }
        std::map<std::string, std::string> *h = vl->hints();
        std::cerr << (*h)["dimension"] << '\n';
        delete vl;
    }

Unfortunately, the current implementation of vector_loader scans the input file twice. This spoils the efficiency advantage of the Vector Stream file format. If time is critical, stick with the C APIs for loading the file.
The following function template helps to break colon-separated values into a valarray.

    template <typename T, typename C>
    void parse_multiple_parameters(std::valarray<T> **v, C converter, char *s);

The converter is a functor (function-like object) and must provide T operator () (const std::string &). For example, if the parameter is s = "1:2:3", then

    std::valarray<double> *parameters = 0;
    pid::parse_multiple_parameters(&parameters, pid::string_to_double(), s);

will give you a new std::valarray<double> of size 3, containing 1.0, 2.0, and 3.0. This example uses the pre-defined functor string_to_double; other functors are also available: string_to_string and string_to_int.
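For readers who stay with the C API, the same splitting of a colon-separated value such as "1:2:3" can be done with strtod. The function below is a hypothetical C counterpart written for this document, not something the library provides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical C counterpart of the template: parse "1:2:3" into out,
   which has room for cap values. Returns the number of values stored,
   or -1 if the input is malformed or does not fit. */
long sketch_parse_colon_doubles(const char *s, double *out, size_t cap)
{
    size_t n = 0;

    assert(s != NULL && out != NULL);
    for (;;) {
        char *end;
        double x = strtod(s, &end);
        if (end == s || n == cap)
            return -1;          /* no number here, or no room left */
        out[n++] = x;
        if (*end == '\0')
            return (long)n;     /* consumed the whole string */
        if (*end != ':')
            return -1;          /* unexpected separator */
        s = end + 1;
    }
}
```

Hint values like %*minimum=1:2:3 from the file format section are the typical input for this kind of parsing.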
Vector Stream provides the following command-line tools. These commands print a help message if no arguments are given.
The command vectorize reads a text file and writes it in Vector Stream file format to the standard output.
The command vcat reads a Vector Stream file and writes it in Vector Stream file format to the standard output. If the -u option is given, the command writes in plain text format.
The command slice -ooffset -llength -sstride input.v slices the input vector input.v with offset offset, length length, and stride stride. If length is 0, the length is calculated automatically.
The command gslice -ooffset -Llength1:length2[:...] -Sstride1:stride2[:...] input.v slices the input vector input.v with offset offset, lengths length1, length2, ..., and strides stride1, stride2, .... If length1 is 0, the length is calculated automatically.
The command splice input1.v input2.v splices two input vectors. The first element of input1.v is copied to the output, followed by the first element of input2.v. Next, the second element of input1.v is copied to the output, followed by the second element of input2.v, and so on. You can give the -sstride option to specify the stride.
The command add input1.v input2.v calculates the sum of the two vectors input1.v and input2.v and writes it to the standard output.
The command statistics reports the maximum value, minimum value, average value, etc. of the input vectors.
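The interleaving performed by splice can be sketched as follows. This is not the tool's source code: sketch_splice is a hypothetical function, both inputs are assumed to have the same length, and the stride is assumed to mean "copy stride elements from each input in turn".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleave of two vectors of n elements each. With
   stride 1 the output is v1[0], v2[0], v1[1], v2[1], ...; with a larger
   stride, blocks of stride elements alternate (assumption). a must hold
   2 * n elements. Returns 0 on success, -1 if n is not a multiple of
   stride or stride is 0. */
int sketch_splice(double *a, size_t n, const double *v1, const double *v2,
                  size_t stride)
{
    size_t b, i, k = 0;

    assert(n == 0 || (a != NULL && v1 != NULL && v2 != NULL));
    if (stride == 0 || n % stride != 0)
        return -1;
    for (b = 0; b < n; b += stride) {
        for (i = 0; i < stride; ++i)
            a[k++] = v1[b + i];
        for (i = 0; i < stride; ++i)
            a[k++] = v2[b + i];
    }
    return 0;
}
```

Under these assumptions, splicing two files of 2-dimensional vectors with stride 2 would produce a file of 4-dimensional vectors.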
Vector Stream is distributed as source code, so you must compile the library and tools yourself after you obtain the source code.
Visit sourceforge.net to download the source code. The source code is managed with SVN.
The Vector Stream library provides a configure script. To install the library and supporting tools, run ./configure, then make, and then make install.
You can contact the author at kanaya (at) users (dot) sourceforge (dot) net.
To be written.
This program/library has been developed under the support of: