=pod

=head1 NAME

Locale::Codes - a distribution of modules to handle locale codes

=head1 DESCRIPTION

B<Locale-Codes> is a distribution containing a set of modules designed
to work with sets of codes which uniquely identify something. For
example, there are codes associated with different countries, different
currencies, different languages, etc. These sets of codes are typically
maintained in some standard.

This distribution provides a way to work with these lists of codes.
Because the various standards do not publish their data through any
consistent API, the lists cannot be accessed directly. To compensate
for this, the list of codes is stored internally within this
distribution, and the distribution is updated on a regular basis to
include all known codes at that point in time. This does mean that it
is necessary to keep this distribution up-to-date to keep up with the
changes that are made in the various standards.

Traditionally, a module has been created to work with each type of
code set. So, there is a module for working with country lists, one
for currency lists, etc. As of version 3.00, all of these individual
modules are wrappers around a central module (which was not intended
to be used directly) which does all of the real work.

Starting with version 3.50, the central module was reworked slightly
to provide an object-oriented interface. All of the modules for
working with individual types of code sets were reworked to use the
improved OO module, so the traditional interfaces still work as they
always have. As a result, you are free to use the traditional
functional (non-OO) interfaces, or to use the OO interface and bypass
the wrapper modules entirely.

Both methods will be supported in the future, so use the one that is
best suited to your needs.

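As a rough illustration of that choice, the sketch below performs the
same lookup through both interfaces. It assumes that the default
country code set is alpha-2 and that C<country2code> is exported by
default from L<Locale::Codes::Country>; the variable names are only
illustrative, and C<< Locale::Codes->new(...) >> is used as an
equivalent spelling of the C<new Locale::Codes ...> form shown in this
document.

```perl
use strict;
use warnings;

use Locale::Codes;             # object-oriented interface
use Locale::Codes::Country;    # traditional interface; exports country2code

# OO interface: one object, told which type of code set to work with.
my $obj     = Locale::Codes->new('country');
my $oo_code = $obj->name2code('Finland');

# Traditional interface: a plain function from the wrapper module.
my $fn_code = country2code('Finland');

print "OO: $oo_code, functional: $fn_code\n";
```

Both calls should resolve to the same code, since the wrapper module is
built on top of the OO module.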
Within each type, any number of code sets are allowed. For example,
sets of country codes are maintained in several different locations
including the ISO-3166 standard, the IANA, and by the United Nations.
The lists of countries are similar, but not identical. Multiple code
sets are supported, though trying to convert from one code set to
another will not always work since the lists of countries do not
correspond one-to-one.

All data in all of these modules comes directly from the original
standards (or as close to direct as possible), so it should be
up-to-date at the time of release.

I plan on releasing a new version several times a year to incorporate
any changes made in the standards. However, I don't always know about
changes that occur, so if any of the standards change, and you want a
new release sooner, just email me and I'll get one out.

=head1 SYNOPSIS (OBJECT-ORIENTED INTERFACE)

   use Locale::Codes;
   or
   use Locale::Codes ':constants';

   $obj = new Locale::Codes 'country';

=head1 OBJECT-ORIENTED METHODS

The following methods are available.

In all methods, a code set can always be specified by its name (as a
string).

Traditionally, you could also use a perl constant to specify the
code set. In order to do so with the OO interface, you have to
import the constants. To do that, load the module with:

   use Locale::Codes ':constants';

=over 4

=item B<new ( [TYPE [,CODESET]] )>

   $obj = new Locale::Codes;
   $obj = new Locale::Codes 'country';
   $obj = new Locale::Codes 'country','alpha-3';
   $obj = new Locale::Codes 'country',LOCALE_COUNTRY_ALPHA_3;

This creates a new object that can access the data. If no type is specified
(in the first argument), you must use the B<type> method described below.
No operations will work unless the type is specified.

The second argument is the default code set to use. This is optional, as
each type has a default code set. The default code set can also be set
later using the B<codeset> method below.

The last example is only available if the constants were imported when
the module was loaded.

=item B<show_errors ( FLAG )>

   $obj->show_errors(1);
   $obj->show_errors(0);

By default, error messages will be produced when bad data is passed
to any method. By passing in '0', these will be turned off so that
all failures will be silent.

=item B<type ( TYPE )>

   $obj->type($type);

This will set the type of codes that will be worked with. C<$type> may
be any of the recognized types of code sets, including:

   country
   language
   currency
   script
   etc.

The list of valid types, and the code sets supported in each, are described
in the L<Locale::Codes::Types> document.

This method can be called any number of times to switch between different
types of code sets.

=item B<codeset ( CODESET )>

   $obj->codeset($codeset);

This sets the default code set to use. The list of code sets available
for each type is described in the L<Locale::Codes::Types> document.

In all other methods below, when an optional B<CODESET> argument is
omitted, it will default to this value.

=item B<code2name ( CODE [,CODESET] [,'retired'] )>

   $name = $obj->code2name($code [,$codeset] [,'retired']);

This method takes a code and returns a string which contains
the name of the element identified. If the code is not a valid
code in the B<CODESET> specified then C<undef> will be returned.

The name of the element is the name as specified in the standard,
and as a result, different variations of an element name may
be returned for different values of B<CODESET>.

For example, the alpha-2 country code set defines the two-letter
code "bo" to be "Bolivia, Plurinational State of", whereas the
alpha-3 code set defines the code 'bol' to be the country "Bolivia
(Plurinational State of)". So:

   $obj->code2name('bo','alpha-2');
      => 'Bolivia, Plurinational State of'

   $obj->code2name('bol','alpha-3');
      => 'Bolivia (Plurinational State of)'

By default, only active codes will be used, but if the string
'retired' is passed in as an argument, both active and retired
codes will be examined.

=item B<name2code ( NAME [,CODESET] [,'retired'] )>

   $code = $obj->name2code($name [,$codeset] [,'retired']);

This method takes the name of an element (or any of its aliases)
and returns the code that corresponds to it, if it exists. If B<NAME>
could not be identified as the name of one of the elements, then
C<undef> will be returned.

The name is not case sensitive. Also, any known variation of a name
may be passed in.

For example, even though the country names returned using the 'alpha-2'
and 'alpha-3' country codes for Bolivia are different, either country
name may be passed in with either code set (as may the more common
alias 'Bolivia'). So:

   $obj->name2code('Bolivia, Plurinational State of','alpha-2');
      => 'bo'

   $obj->name2code('Bolivia (Plurinational State of)','alpha-2');
      => 'bo'

   $obj->name2code('Bolivia','alpha-2');
      => 'bo'

By default, only active names will be used, but if the string
'retired' is passed in as an argument, both active and retired
names will be examined.

=item B<code2code ( CODE [,CODESET] ,CODESET2 )>

   $code = $obj->code2code($code [,$codeset] ,$codeset2);

This method takes a code from one code set (B<CODESET> or the
default code set), and returns the corresponding code from another
code set (B<CODESET2>). B<CODE> must exist in the code set specified
by B<CODESET> and must have a corresponding code in the
code set specified by B<CODESET2> or C<undef> will be returned.

   $obj->code2code('fin','alpha-3','alpha-2');
      => 'fi'

Note that this method does NOT support retired codes.

=item B<all_codes ( [CODESET] [,'retired'] )>

   @code = $obj->all_codes([$codeset] [,'retired']);

This returns a list of all codes in the code set. The codes will be
sorted.

By default, only active codes will be returned, but if the string
'retired' is passed in as an argument, both active and retired
codes will be returned.

=item B<all_names ( [CODESET] [,'retired'] )>

   @name = $obj->all_names([$codeset] [,'retired']);

This method returns a list of all element names for which there is a
corresponding code in the specified code set.

The names returned are exactly as they are specified in the standard,
and are sorted.

Since not all elements are listed in all code sets, the list of
elements may differ depending on the code set specified.

By default, only active names will be returned, but if the string
'retired' is passed in as an argument, both active and retired
names will be returned.

=back

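The lookup methods above can be combined as in the following sketch. It
is only an illustration: it assumes the default country code set is
alpha-2, uses 'zz' as a code that is assumed not to be assigned, and
writes the constructor as C<< Locale::Codes->new(...) >>, which is
equivalent to the C<new Locale::Codes ...> form used in this document.

```perl
use strict;
use warnings;

use Locale::Codes;

# Work with country codes, using the default code set for that type.
my $obj = Locale::Codes->new('country');
$obj->show_errors(0);    # bad input returns undef without a message

my $code   = $obj->name2code('Bolivia');    # look up by a common alias
my $name   = $obj->code2name($code);        # name as given in the standard
my $alpha2 = $obj->code2code('fin', 'alpha-3', 'alpha-2');
my $bogus  = $obj->code2name('zz');         # assumed to be unassigned
my @codes  = $obj->all_codes();             # sorted list of active codes

print "$code => $name\n";
print "fin => $alpha2\n";
print "zz  => ", (defined $bogus ? $bogus : 'undef'), "\n";
print scalar(@codes), " active codes\n";
```

The exact name printed for Bolivia depends on the version of the data
that is installed, which is why the sketch prints it rather than
comparing it against a fixed string.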
The following additional methods are available and can be used to
modify the code list data (and are therefore not generally useful).

=over 4

=item B<rename_code ( CODE ,NEW_NAME [,CODESET] )>

   $flag = $obj->rename_code($code,$new_name [,$codeset]);

This method can be used to change the official name of an element. At
that point, the name returned by the C<code2name> method would be
B<NEW_NAME> instead of the name specified in the standard.

The original name will remain as an alias.

For example, the official country name for code 'gb' is 'United
Kingdom'. If you want to change that, you might call:

   $obj->rename_code('gb', 'Great Britain');

This means that calling code2name('gb') will now return 'Great
Britain' instead of 'United Kingdom'.

If any error occurs, a warning is issued and 0 is returned. An error
occurs if B<CODE> doesn't exist in the specified code set, or if
B<NEW_NAME> is already in use but for a different element.

If the method succeeds, 1 is returned.

=item B<add_code ( CODE ,NAME [,CODESET] )>

   $flag = $obj->add_code($code,$name [,$codeset]);

This method is used to add a new code and name to the data.

Both B<CODE> and B<NAME> must be unused in the data set or an error
occurs (though B<NAME> may be used in a different data set).

For example, to create the fictitious country named "Duchy of
Grand Fenwick" with codes "fe" and "fen", use the following:

   $obj->add_code("fe","Duchy of Grand Fenwick",'alpha-2');
   $obj->add_code("fen","Duchy of Grand Fenwick",'alpha-3');

The return value is 1 on success, 0 on an error.

=item B<delete_code ( CODE [,CODESET] )>

   $flag = $obj->delete_code($code [,$codeset]);

This method is used to delete a code from the data.

B<CODE> must refer to an existing code in the code set.

The return value is 1 on success, 0 on an error.

=item B<add_alias ( NAME ,NEW_NAME )>

   $flag = $obj->add_alias($name,$new_name);

This method is used to add a new alias to the data. Aliases do
not alter the return value of the C<code2name> method.

B<NAME> must be an existing element name, and B<NEW_NAME> must
be unused or an error occurs.

The return value is 1 on success, 0 on an error.

=item B<delete_alias ( NAME )>

   $flag = $obj->delete_alias($name);

This method is used to delete an alias from the data. Once
removed, the element may not be referred to by B<NAME>.

B<NAME> must be one of a list of at least two names that may be used to
specify an element. If the element may only be referred to by a single
name, you'll need to use the C<add_alias> method to add a new alias
first, or the C<delete_code> method to remove the element entirely.

If the alias is used as the name in any code set, one of the other
names will be used instead. Predicting exactly which one will
be used requires you to know the order in which the standards
were read, which is not reliable, so you may want to use the
C<rename_code> method to force one of the alternate names to be
used.

The return value is 1 on success, 0 on an error.

=item B<replace_code ( CODE ,NEW_CODE [,CODESET] )>

   $flag = $obj->replace_code($code,$new_code [,$codeset]);

This method is used to change the official code for an element. At
that point, the code returned by the C<name2code> method would be
B<NEW_CODE> instead of the code specified in the standard.

B<NEW_CODE> may either be a code that is not in use, or it may be an
alias for B<CODE> (in which case, B<CODE> becomes an alias and B<NEW_CODE>
becomes the "real" code).

The original code is kept as an alias, so that the C<code2name> method
will work with either the code from the standard or the new code.

However, the C<all_codes> method will only return the codes which
are considered "real" (which means that the list of codes will now
contain B<NEW_CODE>, but will not contain B<CODE>).

=item B<add_code_alias ( CODE ,NEW_CODE [,CODESET] )>

   $flag = $obj->add_code_alias($code,$new_code [,$codeset]);

This method adds an alias for the code. At that point, B<NEW_CODE> and B<CODE>
will both work in the C<code2name> method. However, the C<name2code> method will
still return the original code.

=item B<delete_code_alias ( CODE [,CODESET] )>

This method deletes an alias for the code.

It will only work if B<CODE> is actually an alias. If it is the "real"
code, it will not be deleted. You will need to use the C<replace_code>
method to switch the real code with one of the aliases, and then
delete the alias.

=back

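A minimal sketch of how some of the modification methods fit together
is given below. It is not meant as a recommended workflow: the names
'Grand Fenwick' and 'Grand Duchy of Fenwick' are made up for the
illustration, and the sketch assumes that neither the code 'fe' nor any
of these names is already present in the installed data.

```perl
use strict;
use warnings;

use Locale::Codes;

my $obj = Locale::Codes->new('country');
$obj->show_errors(0);    # failures are reported only via return values

# Add a fictitious country to the default (alpha-2) code set.
# Assumes neither the code 'fe' nor the name is already in use.
my $added = $obj->add_code('fe', 'Duchy of Grand Fenwick');

# Register an extra name; code2name still reports the original one.
my $aliased   = $obj->add_alias('Duchy of Grand Fenwick', 'Grand Fenwick');
my $via_alias = $obj->name2code('Grand Fenwick');

# Make a different name the one that code2name reports.
my $renamed  = $obj->rename_code('fe', 'Grand Duchy of Fenwick');
my $new_name = $obj->code2name('fe');

# Remove the code again; later lookups for it fail.
my $deleted = $obj->delete_code('fe');
my $gone    = $obj->code2name('fe');

print "added=$added aliased=$aliased renamed=$renamed deleted=$deleted\n";
```

Checking each return value, as the sketch does by storing it, is the
only way to notice a failure once error messages have been turned off
with C<show_errors(0)>.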
=head1 TRADITIONAL INTERFACES

In addition to the primary OO module, the following modules are included in
the distribution for the traditional way of working with code sets.

Each module will work with one specific type of code set.

=over 4

=item L<Locale::Codes::Country>, L<Locale::Country>

This includes support for country codes (such as those listed in ISO-3166)
to specify the country.

Because this module was originally distributed as L<Locale::Country>, it is
also available under that name.

=item L<Locale::Codes::Language>, L<Locale::Language>

This includes support for language codes (such as those listed in ISO-639)
to specify the language.

Because this module was originally distributed as L<Locale::Language>, it is
also available under that name.

=item L<Locale::Codes::Currency>, L<Locale::Currency>

This includes support for currency codes (such as those listed in ISO-4217)
to specify the currency.

Because this module was originally distributed as L<Locale::Currency>, it is
also available under that name.

=item L<Locale::Codes::Script>, L<Locale::Script>

This includes support for script codes (such as those listed in ISO-15924)
to specify the script.

Because this module was originally distributed as L<Locale::Script>, it is
also available under that name.

=item L<Locale::Codes::LangExt>

This includes support for language extension codes (such as those listed
in the IANA language registry) to specify the language extension.

=item L<Locale::Codes::LangVar>

This includes support for language variation codes (such as those listed
in the IANA language registry) to specify the language variation.

=item L<Locale::Codes::LangFam>

This includes support for language family codes (such as those listed
in ISO 639-5) to specify families of languages.

=back

In addition to the modules above, there are a number of support modules included
in the distribution. Any module not listed above falls into that category.

These modules are not intended to be used by programmers. They contain functions
or data that are used by the modules listed above. No support of any kind is
offered for using these modules directly. They may be modified at any time.

=head1 COMMON ALIASES

As of version 2.00, the modules support common variants of names.

For example, Locale::Country supports variant names for countries, and
a few of the most common ones are included in the data. The country
code for "United States" is "us", so:

   country2code('United States');
     => "us"

Now the following will also return 'us':

   country2code('United States of America');
   country2code('USA');

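The alias behaviour described above can be checked with a short script
such as the following sketch, which uses the traditional interface. It
assumes that C<country2code> is exported by default from
L<Locale::Codes::Country>; the hash and variable names are only
illustrative.

```perl
use strict;
use warnings;

use Locale::Codes::Country;    # traditional interface; exports country2code

# Variant names taken from the examples above.
my @variants = ('United States', 'United States of America', 'USA');

# Look up each variant and remember the code it resolves to.
my %code_for;
for my $name (@variants) {
    $code_for{$name} = country2code($name);
}

for my $name (@variants) {
    printf "%-26s => %s\n", $name, $code_for{$name};
}
```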
Any number of common aliases may be included in the data, in addition
to the names that come directly from the standards. If you have a
common alias for a country, language, or any other of the types of
codes, let me know and I'll add it, with some restrictions.

For example, the country name "North Korea" never appeared in any of
the official sources (instead, it was "Korea, North" or "Korea,
Democratic People's Republic of"). I would honor a request to add an
alias "North Korea" since that's a very common way to specify the
country (please don't request this... I've already added it).

On the other hand, a request to add Zaire as an alias for "Congo, The
Democratic Republic of" will not be honored. The country's official
name is no longer Zaire, so adding it as an alias violates the
standard. Zaire was kept as an alias in versions of this module prior
to 3.00, but it has been removed. Other aliases (if any) which no
longer appear in any standard (and which are not common variations of
the name in the standards) have also been removed.

=head1 RETIRED CODES

Occasionally, a code is deprecated, but it may still be desirable to
have access to it.

Although there is no way to see every code that has ever existed and
been deprecated (since most codesets do not have that information
available), as of version 3.20, every code which has ever been included
in these modules can be referenced.

For more information, refer to the documentation on the code2name, name2code,
all_codes, and all_names methods above.

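As a small sketch of the 'retired' argument, the following compares the
active alpha-2 country codes with the list that also includes retired
ones. The variable names are illustrative, and which codes (if any) show
up as retired depends on the version of the data that is installed.

```perl
use strict;
use warnings;

use Locale::Codes;

my $obj = Locale::Codes->new('country');

# Active codes only, then active plus retired codes.
my @active   = $obj->all_codes('alpha-2');
my @with_old = $obj->all_codes('alpha-2', 'retired');

# Anything in the second list but not the first has been retired.
my %is_active    = map { $_ => 1 } @active;
my @retired_only = grep { !$is_active{$_} } @with_old;

print scalar(@active), " active codes, ",
      scalar(@retired_only), " retired codes\n";
```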
=head1 SEE ALSO

=over 4

=item L<Locale::Codes::Types>

The list of all code sets available for each type.

=item L<Locale::Codes::Changes>

A history of changes made to this distribution.

=back

=head1 KNOWN BUGS AND LIMITATIONS

=over 4

=item B<Relationship between code sets>

Because each code set uses a slightly different list of elements, and
they are not necessarily one-to-one, there may be some confusion
about the relationship between codes from different code sets.

For example, ISO 3166 assigns one code to the country "United States
Minor Outlying Islands", but the IANA codes give different codes
to different islands (Baker Island, Howland Island, etc.).

This may cause some confusion... I've done the best that I could do
to minimize it.

=item B<Non-ASCII characters not supported>

Currently all names must be all ASCII. I plan on relaxing that
limitation in the future.

=back

=head1 BUGS AND QUESTIONS

If you find a bug in Locale::Codes, there are three ways to send it to me.
Any of them is fine, so use the method that is easiest for you.

=over 4

=item Direct email

You are welcome to send it directly to me by email. The email address
to use is: sbeck@cpan.org.

=item CPAN Bug Tracking

You can submit it using the CPAN tracking tool. This can be done at the
following URL:

L<http://rt.cpan.org/Public/Dist/Display.html?Name=Locale-Codes>

=item GitHub

You can submit it as an issue on GitHub. This can be done at the following
URL:

L<https://github.com/SBECK-github/Locale-Codes>

=back

Please do not use other means to report bugs (such as forums for a specific
OS or Linux distribution) as it is impossible for me to keep up with all of
them.

When filing a bug report, please include the following information:

=over 4

=item B<Locale::Codes version>

Please include the version of Locale::Codes you are using. You can get
this by using the script:

   use Locale::Codes;
   print $Locale::Codes::VERSION,"\n";

=back

If you want to report missing or incorrect codes, you must be running the
most recent version of Locale::Codes.

If you find any problems with the documentation (errors, typos, or items
that are not clear), please send them to me. I welcome any suggestions
that will allow me to improve the documentation.

=head1 AUTHOR

Locale::Country and Locale::Language were originally written by Neil
Bowers at the Canon Research Centre Europe (CRE). They maintained the
distribution from 1997 to 2001.

Locale::Currency was originally written by Michael Hennecke and was
modified by Neil Bowers for inclusion in the distribution.

From 2001 to 2004, maintenance was continued by Neil Bowers. He
modified Locale::Currency for inclusion in the distribution. He also
added Locale::Script.

From 2004 to 2009, the module was unmaintained.

In 2010, maintenance was taken over by Sullivan Beck (sbeck@cpan.org)
with Neil Bowers' permission. All problems or comments should be
sent to him using any of the methods listed above.

=head1 COPYRIGHT

   Copyright (c) 1997-2001 Canon Research Centre Europe (CRE).
   Copyright (c) 2001      Michael Hennecke (Locale::Currency)
   Copyright (c) 2001-2010 Neil Bowers
   Copyright (c) 2010-2018 Sullivan Beck

This module is free software; you can redistribute it and/or
modify it under the same terms as Perl itself.

=cut