Alexey Borzenkov [Wed, 24 Jun 2009 11:20:21 +0000]
Fix compilation with cygwin 1.5
Cygwin 1.5 doesn't define wcsxfrm, and Cygwin 1.7 implements it
as wcslcpy, so under cygwin it should be safe to define wcsxfrm as
wcslcpy.
Alexey Borzenkov [Tue, 14 Apr 2009 08:50:54 +0000]
Fix compilation under mingw
Mingw does not support visibility("hidden") and aborts
Nikolai Weibull [Mon, 30 Mar 2009 21:15:38 +0000]
.gitignore: Ignore *.bundle for Mac OS X
Nikolai Weibull [Mon, 30 Mar 2009 21:15:25 +0000]
Fix compilation directives
Nikolai Weibull [Mon, 30 Mar 2009 19:40:04 +0000]
Add unified binary search and table lookup code
Nikolai Weibull [Fri, 7 Dec 2007 10:15:33 +0000]
Rakefile: v0.4.1
Nikolai Weibull [Fri, 7 Dec 2007 10:15:10 +0000]
ext/encoding/character/utf-8/private.h: Fix HIDDEN macro for non-GCC.
Nikolai Weibull [Tue, 4 Dec 2007 20:13:48 +0000]
Time for v0.4.0
Nikolai Weibull [Tue, 4 Dec 2007 20:00:21 +0000]
Remove inline attribute from _unichar_combining_class and rename it.
Nikolai Weibull [Thu, 22 Nov 2007 15:34:21 +0000]
Time for version 0.3.0.
Nikolai Weibull [Thu, 22 Nov 2007 15:33:07 +0000]
Fix edge case in #reverse.
Nikolai Weibull [Thu, 22 Nov 2007 15:25:44 +0000]
Fix minor bug in handling of Bignums.
This fixes
[#7037] u("10000").to_i -> 5433328 if (bit_length > sizeof(VALUE)
* CHAR_BIT)
Thanks go out to Hannes Wyss for reporting this and providing a patch.
Nikolai Weibull [Wed, 1 Aug 2007 14:04:47 +0000]
Simplify combine() somewhat.
Nikolai Weibull [Wed, 1 Aug 2007 14:00:29 +0000]
Simplify lookup_compose() slightly.
Nikolai Weibull [Wed, 1 Aug 2007 13:57:10 +0000]
Simplify the test in decompose_hangul() for slightly more readable code.
Nikolai Weibull [Wed, 1 Aug 2007 13:50:07 +0000]
Use newly defined split Unicode-table lookup functions in more places.
Nikolai Weibull [Wed, 1 Aug 2007 13:42:17 +0000]
Remove old comment.
Nikolai Weibull [Wed, 1 Aug 2007 13:41:01 +0000]
Refactor the split Unicode-table lookup routines.
Nikolai Weibull [Mon, 30 Jul 2007 19:12:17 +0000]
Remove left-over break.
Nikolai Weibull [Mon, 30 Jul 2007 14:31:15 +0000]
Simplify calling-sequence for decompose_hangul().
We might as well return the length of the decomposed character instead of
storing it in a parameter. This simplifies its use in
normalize_wc_decompose() as we don’t need to introduce a temporary.
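
A minimal sketch of the calling sequence described above, assuming the standard Hangul syllable decomposition arithmetic from the Unicode standard. The function name `sketch_decompose_hangul`, its signature, and the constant names are hypothetical and are not taken from the repository:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the Unicode Hangul syllable decomposition algorithm. */
#define S_BASE  0xAC00u
#define L_BASE  0x1100u
#define V_BASE  0x1161u
#define T_BASE  0x11A7u
#define V_COUNT 21u
#define T_COUNT 28u
#define N_COUNT (V_COUNT * T_COUNT)   /* 588 */
#define S_COUNT (19u * N_COUNT)       /* 11172 */

/* Hypothetical signature: writes up to three code points to `out` and
 * returns how many were written, or 0 if `c` is not a precomposed
 * Hangul syllable. Returning the length means the caller needs no
 * temporary for an out-parameter. */
static size_t sketch_decompose_hangul(uint32_t c, uint32_t out[3])
{
    assert(out != NULL);
    if (c < S_BASE || c >= S_BASE + S_COUNT)
        return 0;

    uint32_t s_index = c - S_BASE;
    out[0] = L_BASE + s_index / N_COUNT;              /* leading consonant */
    out[1] = V_BASE + (s_index % N_COUNT) / T_COUNT;  /* vowel */

    uint32_t t_index = s_index % T_COUNT;
    if (t_index == 0)
        return 2;                                     /* no trailing consonant */
    out[2] = T_BASE + t_index;                        /* trailing consonant */
    return 3;
}
```

A caller can then write something like `len += sketch_decompose_hangul(c, buf + len);` directly.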
Nikolai Weibull [Mon, 30 Jul 2007 13:16:00 +0000]
Clean up utf_downcase().
The old utf_downcase() was quite complex and hard to grasp. This cleans
it up a bit by factoring out the special cases somewhat, along with some
duplicated-code removal.
Nikolai Weibull [Mon, 30 Jul 2007 09:21:46 +0000]
Don’t call setlocale() with an argument.
We shouldn’t be overriding the user’s locale.
Nikolai Weibull [Fri, 27 Jul 2007 14:22:23 +0000]
Factor out binary searches.
Binary search was used in a couple of places to look up values in tables.
This patch factors out that code, making the calling code easier to follow
and making sure that any binary-search-related bugs are caught at one point
in the code.
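
As an illustration of the idea, a single shared lookup helper might look like the sketch below. The `range_entry` layout and the name `sketch_table_lookup` are assumptions made for the example; the actual tables and helpers in the repository may be shaped differently:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: an inclusive code point range and a value. */
struct range_entry {
    uint32_t first;
    uint32_t last;
    int value;
};

/* Binary search over a table sorted by range, with no overlapping
 * entries. Returns the entry containing `c`, or NULL if none does. */
static const struct range_entry *
sketch_table_lookup(const struct range_entry *table, size_t n, uint32_t c)
{
    assert(table != NULL || n == 0);

    size_t lo = 0, hi = n;                /* search the half-open [lo, hi) */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;  /* avoids overflow of lo + hi */
        if (c < table[mid].first)
            hi = mid;
        else if (c > table[mid].last)
            lo = mid + 1;
        else
            return &table[mid];
    }
    return NULL;
}
```

With one helper like this, each call site reduces to a lookup and a NULL check, and any off-by-one error only has to be fixed once.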
Nikolai Weibull [Fri, 27 Jul 2007 12:02:00 +0000]
Fix corner-case for special-cased characters.
Nikolai Weibull [Fri, 27 Jul 2007 11:58:44 +0000]
Use OFFSET_IF() in one more place.
Using the OFFSET_IF() macro is a lot easier to understand than the whole
(X != NULL) ? X + LEN : NULL.
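
For reference, a macro with this behavior could be written as below. The definition is reconstructed from the expression quoted above and is an assumption, as is the helper `sketch_put_ascii`, which only shows the "count only or count and write" pattern that such a macro supports:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed definition: offset the buffer only if there is one. */
#define OFFSET_IF(buf, len) (((buf) != NULL) ? (buf) + (len) : NULL)

/* Hypothetical helper: copies `src` to `out` when `out` is non-NULL and
 * always returns the number of bytes that are (or would be) written. */
static size_t sketch_put_ascii(char *out, const char *src)
{
    assert(src != NULL);

    size_t len = 0;
    for (; *src != '\0'; src++) {
        char *p = OFFSET_IF(out, len);  /* NULL when only counting */
        if (p != NULL)
            *p = *src;
        len++;
    }
    return len;
}
```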
Nikolai Weibull [Fri, 27 Jul 2007 11:55:02 +0000]
Clarify edge-case for utf_char_*() functions where MAX is 0.
If MAX is 0, there's no possible character to read.
Nikolai Weibull [Fri, 27 Jul 2007 11:47:03 +0000]
Don’t mark _unichar_combining_class() as inline.
This function is, as it is currently defined, not eligible for inlining.
Nikolai Weibull [Fri, 27 Jul 2007 11:46:01 +0000]
Add inlining flags and warnings for GCC.
Nikolai Weibull [Fri, 27 Jul 2007 11:45:21 +0000]
Add tests for casing methods.
Nikolai Weibull [Sun, 30 Jul 2006 20:14:55 +0000]
Update the position-calculating code
Saying that the position is the difference between the point we’re at and
the beginning of the string using pointer subtraction is only correct if
we’re talking bytes. Use utf_pointer_to_offset() instead to give the
character offset.
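
The difference between the two measures can be sketched as follows. The function `sketch_pointer_to_offset` is a hypothetical stand-in written for this example, not the library's utf_pointer_to_offset(); it assumes valid UTF-8 and a position that lies on a character boundary:

```c
#include <assert.h>
#include <stddef.h>

/* Counts the characters in [str, pos) by counting every byte that is
 * not a UTF-8 continuation byte (bit pattern 10xxxxxx). */
static size_t sketch_pointer_to_offset(const char *str, const char *pos)
{
    assert(str != NULL && pos != NULL && str <= pos);

    size_t n = 0;
    for (const char *p = str; p < pos; p++)
        if (((unsigned char)*p & 0xC0) != 0x80)
            n++;
    return n;
}
```

For a string such as "naïve", where "ï" takes two bytes, pointer subtraction up to the "v" gives 4 while the character offset is 3.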
Nikolai Weibull [Sat, 29 Jul 2006 14:49:33 +0000]
Add #define for UNICODE_FIRST_CHAR_PART2 to character-tables.h
Also, make sure that this #define is used.
Nikolai Weibull [Sat, 29 Jul 2006 14:37:46 +0000]
Factor out decomposition-to-unichar-array code.
Nikolai Weibull [Fri, 28 Jul 2006 12:29:23 +0000]
Simplify normalization code a bit.
We don’t have to shift down all characters all the time, simply make sure
to copy the ones we want into the slots that exist.
Thanks go out to Lugovoi Nikolai for the suggestion and sample
implementation of this patch.
Nikolai Weibull [Thu, 27 Jul 2006 21:03:43 +0000]
Fix silly remnant from change to normalization code.
The ‘swapped’ parameter was recently added to
unicode_canonical_ordering_swap(), but it was still pretending to return
whether something was swapped or not. Simply turn the function into a
void-returning one.
Nikolai Weibull [Thu, 27 Jul 2006 21:00:32 +0000]
Fix typo in PackageFiles definition
The specifications wouldn’t be distributed with the package.
Nikolai Weibull [Thu, 27 Jul 2006 20:59:20 +0000]
Make sure that the “tests/data” directory exists
This makes sure that the “tests/data” directory exists before looking for a
file in that directory.
Nikolai Weibull [Thu, 27 Jul 2006 20:51:15 +0000]
Add normalization methods
It’s time to add normalization methods. It’s basically String#normalize
that takes an optional parameter to specify what kind of normalization is
desired (:nfc, :nfd, :nfkc, :nfkd). Also add test cases for this along
with test cases for String#foldcase.
Nikolai Weibull [Thu, 27 Jul 2006 20:20:43 +0000]
Cleanup extconf.rb for utf-8 directory
Nikolai Weibull [Thu, 27 Jul 2006 20:15:07 +0000]
Update normalization code to adhere to PR #29.
According to PR #29 [1] there’s a mistake in the Unicode standard. This
patch updates the normalization code to adhere to the updated rules of PR
[1] http://www.unicode.org/review/pr-29.html
Nikolai Weibull [Thu, 27 Jul 2006 17:13:30 +0000]
Upcasing problem due to stupid boolean reversal.
Upper is when it isn’t lower. It should really not be called upper, but
rather not_lower.
Nikolai Weibull [Thu, 27 Jul 2006 15:23:54 +0000]
Update Unicode-data script to take arguments
The Unicode-data-generating script, generate-unicode-data.rb, had hardcoded
the version of Unicode and the location of the input files. Take these as
arguments instead.
Also, along with this, update the generated files so that the correct
version is listed.
Nikolai Weibull [Wed, 26 Jul 2006 13:19:47 +0000]
Fix String#upcase in Lithuanian locale.
The problem was that accents were being doubled for ‘i’s in the Lithuanian
locale. The reason for this was that the pointer into the string being
upcased wasn’t being advanced properly when outputting the correct combining
characters, so previous combining characters (from the lowercase version)
would linger. The fix was to make sure that the pointer into the string
was advanced properly.
This problem was spotted by Lugovoi Nikolai.
Nikolai Weibull [Wed, 26 Jul 2006 09:52:29 +0000]
Fix two signedness warnings produced by GCC 4.1.
Nikolai Weibull [Wed, 26 Jul 2006 09:37:06 +0000]
Fix bug in String#[] when passing it a Regexp.
The problem was simple: a parenthesis had been moved to the end of the
expression; silly me. Thanks go to Lugovoi Nikolai for reporting this and
providing a fix.
Nikolai Weibull [Mon, 24 Jul 2006 23:08:12 +0000]
lib/encoding/character/utf-8.rb: Remove .so extension from require line.
Not all platforms use .so as a library extension, so just let Ruby figure
it out for itself.
Nikolai Weibull [Mon, 24 Jul 2006 23:07:05 +0000]
Fix bug in upcasing code that would make upcasing in Lithuanian not work.
Nikolai Weibull [Mon, 24 Jul 2006 12:33:25 +0000]
Remove tags file that was inadvertently included in repository.
Nikolai Weibull [Mon, 24 Jul 2006 12:26:44 +0000]
Make array of word strings const to save some relocation work.
As suggested in [1], use const for the array as well.
(Also, move declaration of Init_utf8() to right before the function itself,
as it’s only there to shut up the compiler about an undeclared extern.)
[1] Ulrich Drepper, “How To Write Shared Libraries”,
http://people.redhat.com/drepper/dsohowto.pdf.
Nikolai Weibull [Mon, 24 Jul 2006 11:12:37 +0000]
Remove old Unicode data files (4.1).
These files should probably not have been in the repository at all.
By the way, I forgot to credit Lugovoi Nikolai for finding the last two
bugs (#length and #upcase/#downcase). Sorry about that.
Nikolai Weibull [Mon, 24 Jul 2006 11:10:08 +0000]
Prevent utf_upcase() and utf_downcase() from casing uncasable characters.
Not all lowercase characters have an uppercase equivalent, and vice versa.
Add a check to make sure that we don’t try to do it anyway.
Nikolai Weibull [Mon, 24 Jul 2006 10:09:14 +0000]
Add a specification for the #length method.
To make sure that we keep to the specification of #length, add a
specification for it.
Nikolai Weibull [Mon, 24 Jul 2006 10:08:27 +0000]
Updated utf_length_n() to count NUL bytes as normal characters.
Previously utf_length_n() would stop at the first NUL byte, even if ‘len’
said that more data was available. This conforms with the semantics of,
for example, strnlen(), but as we in Ruby can easily get Strings that
contain any byte, we can’t treat NUL bytes separately in the case where
‘len’ is provided (as opposed to utf_length(), that is). So all *_n()
functions will have to be rewritten to handle NUL bytes like any other
byte. We begin with the utf_length_n() function, because it is an obvious
case where this was causing a problem.
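
A rough sketch of the intended semantics, assuming valid UTF-8 input. The name `sketch_length_n` is hypothetical and the body is not the library's implementation; the point is only that the loop is bounded by ‘len’ alone and never tests for a NUL byte:

```c
#include <assert.h>
#include <stddef.h>

/* Counts characters in the first `len` bytes of `str`. A NUL byte is
 * not a continuation byte, so it is counted like any other character
 * instead of ending the scan. */
static size_t sketch_length_n(const char *str, size_t len)
{
    assert(str != NULL || len == 0);

    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (((unsigned char)str[i] & 0xC0) != 0x80)
            n++;
    return n;
}
```

Given the five bytes 'a', NUL, 0xC3, 0xA9, 'b', this counts four characters, whereas a strnlen()-style scan would stop after the first.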
Nikolai Weibull [Sun, 23 Jul 2006 22:26:17 +0000]
Rakefile: Simplify TAGS target.
Nikolai Weibull [Sun, 23 Jul 2006 21:50:39 +0000]
Rakefile: Add clean and clobber tasks and fix some small bugs.
The Makefiles won’t always exist, so fake a list of sources that is
probably going to be right if one can’t be extracted from the Makefile.
Nikolai Weibull [Sun, 23 Jul 2006 21:26:16 +0000]
Rakefile: Add tasks to build, install, and uninstall a gem.
Nikolai Weibull [Sun, 23 Jul 2006 20:53:43 +0000]
Indentation fix.
One can still tell what parts originally come from string.c :-).
Nikolai Weibull [Sun, 23 Jul 2006 17:32:17 +0000]
Remove strstr, strnstr, and strrnstr from public interface.
These functions should never have been exported. Also, their inclusion is
sometimes not necessary so they should really be wrapped up neatly in some
Anyway…
Nikolai Weibull [Sun, 23 Jul 2006 16:23:06 +0000]
Initial commit.