A simple string tokenizer that takes a string and splits it on
whitespace. It can also take an optional string of characters to use as
delimiters, which are returned with the token set as well. This allows
the string to be split in many different ways.
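
For example, a minimal usage sketch (the new()/getTokens() calls follow
the module's CPAN documentation; the arithmetic string and the delimiter
set are just an illustration):

    use String::Tokenizer;

    # Split on whitespace, and also treat '+', '*', '(' and ')' as
    # delimiters that are returned in the token set.
    my $tokenizer = String::Tokenizer->new("(5 + 5) * 10", '+*()');

    # Tokens, in order: '(', '5', '+', '5', ')', '*', '10'
    my @tokens = $tokenizer->getTokens();
    print join("|", @tokens), "\n";    # prints: (|5|+|5|)|*|10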

This is a very basic tokenizer, so more complex needs should be
addressed either with a custom-written tokenizer or by post-processing
the output of this module. Basically, this will not fill everyone's
needs, but it spans the gap between a simple split / /, $string and the
other options, which involve much larger and more complex modules.

Also note that this is not a lexical analyzer. Many people confuse
tokenization with lexical analysis. A tokenizer merely splits its input
into specific chunks; a lexical analyzer classifies those chunks.
Sometimes these two steps are combined, but not here.
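
To make the distinction concrete, a classification pass over the tokens
might look like the plain-Perl sketch below; String::Tokenizer itself
provides no such step, so this part is purely illustrative:

    use String::Tokenizer;

    # Tokenization: split the input into chunks.
    my @tokens = String::Tokenizer->new("(5 + 5) * 10", '+*()')->getTokens();

    # Lexical analysis: classify each chunk. This is the step the
    # module deliberately leaves to the caller.
    for my $token (@tokens) {
        my $type = $token =~ /^\d+$/  ? 'NUMBER'
                 : $token =~ /^[+*]$/ ? 'OPERATOR'
                 : $token =~ /^[()]$/ ? 'PAREN'
                 :                      'UNKNOWN';
        print "$type: $token\n";
    }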

WWW: http://search.cpan.org/dist/String-Tokenizer/
Author: Stevan Little <stevan@iinteractive.com>