Trinity API Reference
correct.cpp
/* enchant
 * Copyright (C) 2003 Dom Lachowicz
 *
 * This library is free software; you can redistribute it and/or
 * modify it under the terms of the GNU Lesser General Public
 * License as published by the Free Software Foundation; either
 * version 2.1 of the License, or (at your option) any later version.
 *
 * This library is distributed in the hope that it will be useful,
 * but WITHOUT ANY WARRANTY; without even the implied warranty of
 * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
 * Lesser General Public License for more details.
 *
 * You should have received a copy of the GNU Lesser General Public
 * License along with this library; if not, write to the
 * Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor,
 * Boston, MA 02110-1301, USA.
 *
 * In addition, as a special exception, Dom Lachowicz
 * gives permission to link the code of this program with
 * non-LGPL Spelling Provider libraries (eg: a MSFT Office
 * spell checker backend) and distribute linked combinations including
 * the two. You must obey the GNU Lesser General Public License in all
 * respects for all of the code used other than said providers. If you modify
 * this file, you may extend this exception to your version of the
 * file, but you are not obligated to do so. If you do not wish to
 * do so, delete this exception statement from your version.
 */

/*
 * correct.c - Routines to manage the higher-level aspects of spell-checking
 *
 * This code originally resided in ispell.c, but was moved here to keep
 * file sizes smaller.
 *
 * Copyright (c), 1983, by Pace Willisson
 *
 * Copyright 1992, 1993, Geoff Kuenning, Granada Hills, CA
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 *
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 * 3. All modifications to the source code must be clearly marked as
 *    such. Binary redistributions based on modified source code
 *    must be clearly marked as modified versions in the documentation
 *    and/or other materials provided with the distribution.
 * 4. All advertising materials mentioning features or use of this software
 *    must display the following acknowledgment:
 *      This product includes software developed by Geoff Kuenning and
 *      other unpaid contributors.
 * 5. The name of Geoff Kuenning may not be used to endorse or promote
 *    products derived from this software without specific prior
 *    written permission.
 *
 * THIS SOFTWARE IS PROVIDED BY GEOFF KUENNING AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL GEOFF KUENNING OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

/*
 * $Log$
 * Revision 1.1 2004/01/31 16:44:12 zrusin
 * ISpell plugin.
 *
 * Revision 1.4 2003/08/14 17:51:26 dom
 * update license - exception clause should be Lesser GPL
 *
 * Revision 1.3 2003/07/28 20:40:25 dom
 * fix up the license clause, further win32-registry proof some directory getting functions
 *
 * Revision 1.2 2003/07/16 22:52:35 dom
 * LGPL + exception license
 *
 * Revision 1.1 2003/07/15 01:15:04 dom
 * ispell enchant backend
 *
 * Revision 1.2 2003/01/29 05:50:11 hippietrail
 *
 * Fixed my mess in EncodingManager.
 * Changed many C casts to C++ casts.
 *
 * Revision 1.1 2003/01/24 05:52:31 hippietrail
 *
 * Refactored ispell code. Old ispell global variables had been put into
 * an allocated structure, a pointer to which was passed to many functions.
 * I have now made all such functions and variables private members of the
 * ISpellChecker class. It was C OO, now it's C++ OO.
 *
 * I've fixed the makefiles and tested compilation but am unable to test
 * operation. Please back out my changes if they cause problems which
 * are not obvious or easy to fix.
 *
 * Revision 1.7 2002/09/19 05:31:15 hippietrail
 *
 * More Ispell cleanup. Conditional globals and DEREF macros are removed.
 * K&R function declarations removed, converted to Doxygen style comments
 * where possible. No code has been changed (I hope). Compiles for me but
 * unable to test.
 *
 * Revision 1.6 2002/09/17 03:03:28 hippietrail
 *
 * After seeking permission on the developer list I've reformatted all the
 * spelling source which seemed to have parts which used 2, 3, 4, and 8
 * spaces for tabs. It should all look good with our standard 4-space
 * tabs now.
 * I've concentrated just on indentation in the actual code. More prettying
 * could be done.
 * * NO code changes were made *
 *
 * Revision 1.5 2002/09/13 17:20:12 mpritchett
 * Fix more warnings for Linux build
 *
 * Revision 1.4 2002/03/06 08:27:16 fjfranklin
 * o Only activate compound handling when the hash file says so (Per Larsson)
 *
 * Revision 1.3 2001/05/14 09:52:50 hub
 * Removed newMain.c from GNUmakefile.am
 *
 * C++ comments are not C comment. Changed to C comments
 *
 * Revision 1.2 2001/05/12 16:05:42 thomasf
 * Big pseudo changes to ispell to make it pass around a structure rather
 * than rely on all sorts of gloabals willy nilly here and there. Also
 * fixed our spelling class to work with accepting suggestions once more.
 * This code is dirty, gross and ugly (not to mention still not supporting
 * multiple hash sized just yet) but it works on my machine and will no
 * doubt break other machines.
 *
 * Revision 1.1 2001/04/15 16:01:24 tomas_f
 * moving to spell/xp
 *
 * Revision 1.2 1999/10/05 16:17:28 paul
 * Fixed build, and other tidyness.
 * Spell dialog enabled by default, with keyboard binding of F7.
 *
 * Revision 1.1 1999/09/29 23:33:32 justin
 * Updates to the underlying ispell-based code to support suggested corrections.
 *
 * Revision 1.59 1995/08/05 23:19:43 geoff
 * Fix a bug that caused offsets for long lines to be confused if the
 * line started with a quoting uparrow.
 *
 * Revision 1.58 1994/11/02 06:56:00 geoff
 * Remove the anyword feature, which I've decided is a bad idea.
 *
 * Revision 1.57 1994/10/26 05:12:39 geoff
 * Try boundary characters when inserting or substituting letters, except
 * (naturally) at word boundaries.
 *
 * Revision 1.56 1994/10/25 05:46:30 geoff
 * Fix an assignment inside a conditional that could generate spurious
 * warnings (as well as being bad style). Add support for the FF_ANYWORD
 * option.
 *
 * Revision 1.55 1994/09/16 04:48:24 geoff
 * Don't pass newlines from the input to various other routines, and
 * don't assume that those routines leave the input unchanged.
 *
 * Revision 1.54 1994/09/01 06:06:41 geoff
 * Change erasechar/killchar to uerasechar/ukillchar to avoid
 * shared-library problems on HP systems.
 *
 * Revision 1.53 1994/08/31 05:58:38 geoff
 * Add code to handle extremely long lines in -a mode without splitting
 * words or reporting incorrect offsets.
 *
 * Revision 1.52 1994/05/25 04:29:24 geoff
 * Fix a bug that caused line widths to be calculated incorrectly when
 * displaying lines containing tabs. Fix a couple of places where
 * characters were sign-extended incorrectly, which could cause 8-bit
 * characters to be displayed wrong.
 *
 * Revision 1.51 1994/05/17 06:44:05 geoff
 * Add support for controlled compound formation and the COMPOUNDONLY
 * option to affix flags.
 *
 * Revision 1.50 1994/04/27 05:20:14 geoff
 * Allow compound words to be formed from more than two components
 *
 * Revision 1.49 1994/04/27 01:50:31 geoff
 * Add support to correctly capitalize words generated as a result of a
 * missing-space suggestion.
 *
 * Revision 1.48 1994/04/03 23:23:02 geoff
 * Clean up the code in missingspace() to be a bit simpler and more
 * efficient.
 *
 * Revision 1.47 1994/03/15 06:24:23 geoff
 * Fix the +/-/~ commands to be independent. Allow the + command to
 * receive a suffix which is a deformatter type (currently hardwired to
 * be either tex or nroff/troff).
 *
 * Revision 1.46 1994/02/21 00:20:03 geoff
 * Fix some bugs that could cause bad displays in the interaction between
 * TeX parsing and string characters. Show_char now will not overrun
 * the inverse-video display area by accident.
 *
 * Revision 1.45 1994/02/14 00:34:51 geoff
 * Fix correct to accept length parameters for ctok and itok, so that it
 * can pass them to the to/from ichar routines.
 *
 * Revision 1.44 1994/01/25 07:11:22 geoff
 * Get rid of all old RCS log lines in preparation for the 3.1 release.
 *
 */

#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "ispell_checker.h"
#include "msgs.h"

/*
extern void upcase P ((ichar_t * string));
extern void lowcase P ((ichar_t * string));
extern ichar_t * strtosichar P ((char * in, int canonical));

int compoundflag = COMPOUND_CONTROLLED;
*/

/*
 * Compare two words using the sort order from the hash header. A pure
 * case difference only decides the result if nothing else differs.
 *
 * \param a
 * \param b
 * \param canonical NZ for canonical string chars
 *
 * \return 0 if the strings are identical, otherwise the difference in
 *         sort order at the deciding character
 */
int
ISpellChecker::casecmp (char *a, char *b, int canonical)
{
    ichar_t * ap;
    ichar_t * bp;
    ichar_t inta[INPUTWORDLEN + 4 * MAXAFFIXLEN + 4];
    ichar_t intb[INPUTWORDLEN + 4 * MAXAFFIXLEN + 4];

    strtoichar (inta, a, sizeof inta, canonical);
    strtoichar (intb, b, sizeof intb, canonical);
    for (ap = inta, bp = intb; *ap != 0; ap++, bp++)
    {
        if (*ap != *bp)
        {
            if (*bp == '\0')
                return m_hashheader.sortorder[*ap];
            else if (mylower (*ap))
            {
                if (mylower (*bp) || mytoupper (*ap) != *bp)
                    return static_cast<int>(m_hashheader.sortorder[*ap])
                        - static_cast<int>(m_hashheader.sortorder[*bp]);
            }
            else
            {
                if (myupper (*bp) || mytolower (*ap) != *bp)
                    return static_cast<int>(m_hashheader.sortorder[*ap])
                        - static_cast<int>(m_hashheader.sortorder[*bp]);
            }
        }
    }
    if (*bp != '\0')
        return -static_cast<int>(m_hashheader.sortorder[*bp]);
    for (ap = inta, bp = intb; *ap; ap++, bp++)
    {
        if (*ap != *bp)
        {
            return static_cast<int>(m_hashheader.sortorder[*ap])
                - static_cast<int>(m_hashheader.sortorder[*bp]);
        }
    }
    return 0;
}

/*
 * \param word
 */
void
ISpellChecker::makepossibilities (ichar_t *word)
{
    int i;

    for (i = 0; i < MAXPOSSIBLE; i++)
        m_possibilities[i][0] = 0;
    m_pcount = 0;
    m_maxposslen = 0;
    m_easypossibilities = 0;

#ifndef NO_CAPITALIZATION_SUPPORT
    wrongcapital (word);
#endif

/*
 * according to Pollock and Zamora, CACM April 1984 (V. 27, No. 4),
 * page 363, the correct order for this is:
 * OMISSION = TRANSPOSITION > INSERTION > SUBSTITUTION
 * thus, it was exactly backwards in the old version. -- PWP
 */

    if (m_pcount < MAXPOSSIBLE)
        missingletter (word); /* omission */
    if (m_pcount < MAXPOSSIBLE)
        transposedletter (word); /* transposition */
    if (m_pcount < MAXPOSSIBLE)
        extraletter (word); /* insertion */
    if (m_pcount < MAXPOSSIBLE)
        wrongletter (word); /* substitution */

    if ((m_hashheader.compoundflag != COMPOUND_ANYTIME) &&
        m_pcount < MAXPOSSIBLE)
        missingspace (word); /* two words */

}

/*
 * Add a word to the possibilities list unless it is already there.
 *
 * \param word
 *
 * \return -1 if the possibilities list is now full, 0 otherwise
 */
int
ISpellChecker::insert (ichar_t *word)
{
    int i;
    char * realword;

    realword = ichartosstr (word, 0);
    for (i = 0; i < m_pcount; i++)
    {
        if (strcmp (m_possibilities[i], realword) == 0)
            return (0);
    }

    strcpy (m_possibilities[m_pcount++], realword);
    i = strlen (realword);
    if (i > m_maxposslen)
        m_maxposslen = i;
    if (m_pcount >= MAXPOSSIBLE)
        return (-1);
    else
        return (0);
}

#ifndef NO_CAPITALIZATION_SUPPORT
/*
 * \param word
 */
void
ISpellChecker::wrongcapital (ichar_t *word)
{
    ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];

    /*
    ** When the third parameter to "good" is nonzero, it ignores
    ** case. If the word matches this way, "ins_cap" will recapitalize
    ** it correctly.
    */
    if (good (word, 0, 1, 0, 0))
    {
        icharcpy (newword, word);
        upcase (newword);
        ins_cap (newword, word);
    }
}
#endif

/*
 * \param word
 */
void
ISpellChecker::wrongletter (ichar_t *word)
{
    int i;
    int j;
    int n;
    ichar_t savechar;
    ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];

    n = icharlen (word);
    icharcpy (newword, word);
#ifndef NO_CAPITALIZATION_SUPPORT
    upcase (newword);
#endif

    for (i = 0; i < n; i++)
    {
        savechar = newword[i];
        for (j = 0; j < m_Trynum; ++j)
        {
            if (m_Try[j] == savechar)
                continue;
            else if (isboundarych (m_Try[j]) && (i == 0 || i == n - 1))
                continue;
            newword[i] = m_Try[j];
            if (good (newword, 0, 1, 0, 0))
            {
                if (ins_cap (newword, word) < 0)
                    return;
            }
        }
        newword[i] = savechar;
    }
}

/*
 * \param word
 */
void
ISpellChecker::extraletter (ichar_t *word)
{
    ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
    ichar_t * p;
    ichar_t * r;

    if (icharlen (word) < 2)
        return;

    icharcpy (newword, word + 1);
    for (p = word, r = newword; *p != 0; )
    {
        if (good (newword, 0, 1, 0, 0))
        {
            if (ins_cap (newword, word) < 0)
                return;
        }
        *r++ = *p++;
    }
}

/*
 * \param word
 */
void
ISpellChecker::missingletter (ichar_t *word)
{
    ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN + 1];
    ichar_t * p;
    ichar_t * r;
    int i;

    icharcpy (newword + 1, word);
    for (p = word, r = newword; *p != 0; )
    {
        for (i = 0; i < m_Trynum; i++)
        {
            if (isboundarych (m_Try[i]) && r == newword)
                continue;
            *r = m_Try[i];
            if (good (newword, 0, 1, 0, 0))
            {
                if (ins_cap (newword, word) < 0)
                    return;
            }
        }
        *r++ = *p++;
    }
    for (i = 0; i < m_Trynum; i++)
    {
        if (isboundarych (m_Try[i]))
            continue;
        *r = m_Try[i];
        if (good (newword, 0, 1, 0, 0))
        {
            if (ins_cap (newword, word) < 0)
                return;
        }
    }
}

/*
 * \param word
 */
void ISpellChecker::missingspace (ichar_t *word)
{
    ichar_t firsthalf[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];
    int firstno; /* Index into first */
    ichar_t * firstp; /* Ptr into current firsthalf word */
    ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN + 1];
    int nfirsthalf; /* No. words saved in 1st half */
    int nsecondhalf; /* No. words saved in 2nd half */
    ichar_t * p;
    ichar_t secondhalf[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];
    int secondno; /* Index into second */

    /*
    ** We don't do words of length less than 3; this keeps us from
    ** splitting all two-letter words into two single letters. We
    ** also don't do maximum-length words, since adding the space
    ** would exceed the size of the "possibilities" array.
    */
    nfirsthalf = icharlen (word);
    if (nfirsthalf < 3 || nfirsthalf >= INPUTWORDLEN + MAXAFFIXLEN - 1)
        return;
    icharcpy (newword + 1, word);
    for (p = newword + 1; p[1] != '\0'; p++)
    {
        p[-1] = *p;
        *p = '\0';
        if (good (newword, 0, 1, 0, 0))
        {
            /*
             * Save_cap must be called before good() is called on the
             * second half, because it uses state left around by
             * good(). This is unfortunate because it wastes a bit of
             * time, but I don't think it's a significant performance
             * problem.
             */
            nfirsthalf = save_cap (newword, word, firsthalf);
            if (good (p + 1, 0, 1, 0, 0))
            {
                nsecondhalf = save_cap (p + 1, p + 1, secondhalf);
                for (firstno = 0; firstno < nfirsthalf; firstno++)
                {
                    firstp = &firsthalf[firstno][p - newword];
                    for (secondno = 0; secondno < nsecondhalf; secondno++)
                    {
                        *firstp = ' ';
                        icharcpy (firstp + 1, secondhalf[secondno]);
                        if (insert (firsthalf[firstno]) < 0)
                            return;
                        *firstp = '-';
                        if (insert (firsthalf[firstno]) < 0)
                            return;
                    }
                }
            }
        }
    }
}

/*
 * \param word
 * \param pfxopts Options to apply to prefixes
 */
int
ISpellChecker::compoundgood (ichar_t *word, int pfxopts)
{
    ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
    ichar_t * p;
    ichar_t savech;
    long secondcap; /* Capitalization of 2nd half */

    /*
    ** If compoundflag is COMPOUND_NEVER, compound words are never ok.
    */
    if (m_hashheader.compoundflag == COMPOUND_NEVER)
        return 0;
    /*
    ** Test for a possible compound word (for languages like German that
    ** form lots of compounds).
    **
    ** This is similar to missingspace, except we quit on the first hit,
    ** and we won't allow either member of the compound to be a single
    ** letter.
    **
    ** We don't do words of length less than 2 * compoundmin, since
    ** both halves must be at least compoundmin letters long.
    */
    if (icharlen (word) < 2 * m_hashheader.compoundmin)
        return 0;
    icharcpy (newword, word);
    p = newword + m_hashheader.compoundmin;
    for ( ; p[m_hashheader.compoundmin - 1] != 0; p++)
    {
        savech = *p;
        *p = 0;
        if (good (newword, 0, 0, pfxopts, FF_COMPOUNDONLY))
        {
            *p = savech;
            if (good (p, 0, 1, FF_COMPOUNDONLY, 0)
                || compoundgood (p, FF_COMPOUNDONLY))
            {
                secondcap = whatcap (p);
                switch (whatcap (newword))
                {
                    case ANYCASE:
                    case CAPITALIZED:
                    case FOLLOWCASE: /* Followcase can have l.c. suffix */
                        return secondcap == ANYCASE;
                    case ALLCAPS:
                        return secondcap == ALLCAPS;
                }
            }
        }
        else
            *p = savech;
    }
    return 0;
}

/*
 * \param word
 */
void
ISpellChecker::transposedletter (ichar_t *word)
{
    ichar_t newword[INPUTWORDLEN + MAXAFFIXLEN];
    ichar_t * p;
    ichar_t temp;

    icharcpy (newword, word);
    for (p = newword; p[1] != 0; p++)
    {
        temp = *p;
        *p = p[1];
        p[1] = temp;
        if (good (newword, 0, 1, 0, 0))
        {
            if (ins_cap (newword, word) < 0)
                return;
        }
        temp = *p;
        *p = p[1];
        p[1] = temp;
    }
}

int
ISpellChecker::ins_cap (ichar_t *word, ichar_t *pattern)
{
    int i; /* Index into savearea */
    int nsaved; /* No. of words saved */
    ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];

    nsaved = save_cap (word, pattern, savearea);
    for (i = 0; i < nsaved; i++)
    {
        if (insert (savearea[i]) < 0)
            return -1;
    }
    return 0;
}

int
ISpellChecker::save_cap (ichar_t *word, ichar_t *pattern,
    ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN])
{
    int hitno; /* Index into hits array */
    int nsaved; /* Number of words saved */
    int preadd; /* No. chars added to front of root */
    int prestrip; /* No. chars stripped from front */
    int sufadd; /* No. chars added to back of root */
    int sufstrip; /* No. chars stripped from back */

    if (*word == 0)
        return 0;

    for (hitno = m_numhits, nsaved = 0; --hitno >= 0 && nsaved < MAX_CAPS; )
    {
        if (m_hits[hitno].prefix)
        {
            prestrip = m_hits[hitno].prefix->stripl;
            preadd = m_hits[hitno].prefix->affl;
        }
        else
            prestrip = preadd = 0;
        if (m_hits[hitno].suffix)
        {
            sufstrip = m_hits[hitno].suffix->stripl;
            sufadd = m_hits[hitno].suffix->affl;
        }
        else
            sufadd = sufstrip = 0;
        save_root_cap (word, pattern, prestrip, preadd,
            sufstrip, sufadd,
            m_hits[hitno].dictent, m_hits[hitno].prefix, m_hits[hitno].suffix,
            savearea, &nsaved);
    }
    return nsaved;
}

/*
 * \param word
 * \param pattern
 * \param prestrip
 * \param preadd
 * \param sufstrip
 * \param sufadd
 * \param firstdent
 * \param pfxent
 * \param sufent
 *
 * \return
 */
int
ISpellChecker::ins_root_cap (ichar_t *word, ichar_t *pattern,
    int prestrip, int preadd, int sufstrip, int sufadd,
    struct dent *firstdent, struct flagent *pfxent, struct flagent *sufent)
{
    int i; /* Index into savearea */
    ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN];
    int nsaved; /* Number of words saved */

    nsaved = 0;
    save_root_cap (word, pattern, prestrip, preadd, sufstrip, sufadd,
        firstdent, pfxent, sufent, savearea, &nsaved);
    for (i = 0; i < nsaved; i++)
    {
        if (insert (savearea[i]) < 0)
            return -1;
    }
    return 0;
}

/* ARGSUSED */
void
ISpellChecker::save_root_cap (ichar_t *word, ichar_t *pattern,
    int prestrip, int preadd, int sufstrip, int sufadd,
    struct dent *firstdent, struct flagent *pfxent, struct flagent *sufent,
    ichar_t savearea[MAX_CAPS][INPUTWORDLEN + MAXAFFIXLEN],
    int * nsaved)
{
#ifndef NO_CAPITALIZATION_SUPPORT
    struct dent * dent;
#endif /* NO_CAPITALIZATION_SUPPORT */
    int firstisupper;
    ichar_t newword[INPUTWORDLEN + 4 * MAXAFFIXLEN + 4];
#ifndef NO_CAPITALIZATION_SUPPORT
    ichar_t * p;
    int len;
    int i;
    int limit;
#endif /* NO_CAPITALIZATION_SUPPORT */

    if (*nsaved >= MAX_CAPS)
        return;
    icharcpy (newword, word);
    firstisupper = myupper (pattern[0]);
#ifdef NO_CAPITALIZATION_SUPPORT
    /*
    ** Apply the old, simple-minded capitalization rules.
    */
    if (firstisupper)
    {
        if (myupper (pattern[1]))
            upcase (newword);
        else
        {
            lowcase (newword);
            newword[0] = mytoupper (newword[0]);
        }
    }
    else
        lowcase (newword);
    icharcpy (savearea[*nsaved], newword);
    (*nsaved)++;
    return;
#else /* NO_CAPITALIZATION_SUPPORT */
#define flagsareok(dent) \
    ((pfxent == NULL \
        || TSTMASKBIT (dent->mask, pfxent->flagbit)) \
      && (sufent == NULL \
        || TSTMASKBIT (dent->mask, sufent->flagbit)))

    dent = firstdent;
    if ((dent->flagfield & (CAPTYPEMASK | MOREVARIANTS)) == ALLCAPS)
    {
        upcase (newword); /* Uppercase required */
        icharcpy (savearea[*nsaved], newword);
        (*nsaved)++;
        return;
    }
    for (p = pattern; *p; p++)
    {
        if (mylower (*p))
            break;
    }
    if (*p == 0)
    {
        upcase (newword); /* Pattern was all caps */
        icharcpy (savearea[*nsaved], newword);
        (*nsaved)++;
        return;
    }
    for (p = pattern + 1; *p; p++)
    {
        if (myupper (*p))
            break;
    }
    if (*p == 0)
    {
        /*
        ** The pattern was all-lower or capitalized. If that's
        ** legal, insert only that version.
        */
        if (firstisupper)
        {
            if (captype (dent->flagfield) == CAPITALIZED
                || captype (dent->flagfield) == ANYCASE)
            {
                lowcase (newword);
                newword[0] = mytoupper (newword[0]);
                icharcpy (savearea[*nsaved], newword);
                (*nsaved)++;
                return;
            }
        }
        else
        {
            if (captype (dent->flagfield) == ANYCASE)
            {
                lowcase (newword);
                icharcpy (savearea[*nsaved], newword);
                (*nsaved)++;
                return;
            }
        }
        while (dent->flagfield & MOREVARIANTS)
        {
            dent = dent->next;
            if (captype (dent->flagfield) == FOLLOWCASE
                || !flagsareok (dent))
                continue;
            if (firstisupper)
            {
                if (captype (dent->flagfield) == CAPITALIZED)
                {
                    lowcase (newword);
                    newword[0] = mytoupper (newword[0]);
                    icharcpy (savearea[*nsaved], newword);
                    (*nsaved)++;
                    return;
                }
            }
            else
            {
                if (captype (dent->flagfield) == ANYCASE)
                {
                    lowcase (newword);
                    icharcpy (savearea[*nsaved], newword);
                    (*nsaved)++;
                    return;
                }
            }
        }
    }
    /*
    ** Either the sample had complex capitalization, or the simple
    ** capitalizations (all-lower or capitalized) are illegal.
    ** Insert all legal capitalizations, including those that are
    ** all-lower or capitalized. If the prototype is capitalized,
    ** capitalize all-lower samples. Watch out for affixes.
    */
    dent = firstdent;
    p = strtosichar (dent->word, 1);
    len = icharlen (p);
    if (dent->flagfield & MOREVARIANTS)
        dent = dent->next; /* Skip place-holder entry */
    for ( ; ; )
    {
        if (flagsareok (dent))
        {
            if (captype (dent->flagfield) != FOLLOWCASE)
            {
                lowcase (newword);
                if (firstisupper || captype (dent->flagfield) == CAPITALIZED)
                    newword[0] = mytoupper (newword[0]);
                icharcpy (savearea[*nsaved], newword);
                (*nsaved)++;
                if (*nsaved >= MAX_CAPS)
                    return;
            }
            else
            {
                /* Followcase is the tough one. */
                p = strtosichar (dent->word, 1);
                memmove (
                    reinterpret_cast<char *>(newword + preadd),
                    reinterpret_cast<char *>(p + prestrip),
                    (len - prestrip - sufstrip) * sizeof (ichar_t));
                if (myupper (p[prestrip]))
                {
                    for (i = 0; i < preadd; i++)
                        newword[i] = mytoupper (newword[i]);
                }
                else
                {
                    for (i = 0; i < preadd; i++)
                        newword[i] = mytolower (newword[i]);
                }
                limit = len + preadd + sufadd - prestrip - sufstrip;
                i = len + preadd - prestrip - sufstrip;
                p += len - sufstrip - 1;
                if (myupper (*p))
                {
                    for (p = newword + i; i < limit; i++, p++)
                        *p = mytoupper (*p);
                }
                else
                {
                    for (p = newword + i; i < limit; i++, p++)
                        *p = mytolower (*p);
                }
                icharcpy (savearea[*nsaved], newword);
                (*nsaved)++;
                if (*nsaved >= MAX_CAPS)
                    return;
            }
        }
        if ((dent->flagfield & MOREVARIANTS) == 0)
            break; /* End of the line */
        dent = dent->next;
    }
    return;
#endif /* NO_CAPITALIZATION_SUPPORT */
}
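
The order in which makepossibilities() tries corrections (omission, transposition, insertion, substitution) can be illustrated outside the ISpellChecker class. The following is a minimal sketch, not the plugin's implementation: it assumes plain std::string words, a std::set standing in for good(), and a string of try characters standing in for m_Try. The names Dictionary, addCandidate, suggest and demoSuggest are hypothetical, and capitalization and affix handling are left out entirely.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for ISpellChecker::good(): a plain set lookup.
using Dictionary = std::set<std::string>;

// Append `candidate` if it is a dictionary word that is not listed yet.
// Returns false once the list is full, much like insert() returning -1.
bool addCandidate(const std::string &candidate, const Dictionary &dict,
                  std::vector<std::string> &out, std::size_t maxPossible)
{
    if (out.size() >= maxPossible)
        return false;
    if (dict.count(candidate) == 0)
        return true;
    for (const std::string &seen : out)
        if (seen == candidate)
            return true;
    out.push_back(candidate);
    return out.size() < maxPossible;
}

// Collect single-edit corrections in the order used by makepossibilities().
std::vector<std::string> suggest(const std::string &word, const Dictionary &dict,
                                 const std::string &tryChars, std::size_t maxPossible)
{
    std::vector<std::string> out;
    if (maxPossible == 0)
        return out;

    // 1. Omission: the writer dropped a letter, so try inserting one.
    for (std::size_t i = 0; i <= word.size(); ++i)
        for (char c : tryChars) {
            std::string cand = word;
            cand.insert(i, 1, c);
            if (!addCandidate(cand, dict, out, maxPossible))
                return out;
        }
    // 2. Transposition: swap each adjacent pair of letters.
    for (std::size_t i = 0; i + 1 < word.size(); ++i) {
        std::string cand = word;
        std::swap(cand[i], cand[i + 1]);
        if (!addCandidate(cand, dict, out, maxPossible))
            return out;
    }
    // 3. Insertion: the writer typed an extra letter, so try deleting one.
    if (word.size() >= 2)
        for (std::size_t i = 0; i < word.size(); ++i) {
            std::string cand = word;
            cand.erase(i, 1);
            if (!addCandidate(cand, dict, out, maxPossible))
                return out;
        }
    // 4. Substitution: replace each letter with every other try character.
    for (std::size_t i = 0; i < word.size(); ++i)
        for (char c : tryChars) {
            if (c == word[i])
                continue;
            std::string cand = word;
            cand[i] = c;
            if (!addCandidate(cand, dict, out, maxPossible))
                return out;
        }
    return out;
}

// Usage check: one hit per edit class, reported in the order above.
void demoSuggest()
{
    const Dictionary dict = {"brat", "carat", "cart", "cat"};
    const std::vector<std::string> got =
        suggest("crat", dict, "abcdefghijklmnopqrstuvwxyz", 10);
    const std::vector<std::string> want = {"carat", "cart", "cat", "brat"};
    assert(got == want);
}
```

Because the list is capped, the ordering decides which suggestions survive when there are many candidates; with a cap of 2 the sketch would return only "carat" and "cart".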
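
The splitting strategy of missingspace() can be sketched in the same simplified setting. This assumes std::string words and a std::set in place of good(); the name splitSuggestions is hypothetical, and the capitalization variants produced by save_cap() as well as the maximum-length check are omitted. As in the original, words shorter than three letters are skipped, and each successful split is offered both with a space and with a hyphen.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Try every split point; keep splits where both halves are dictionary words.
std::vector<std::string> splitSuggestions(const std::string &word,
                                          const std::set<std::string> &dict)
{
    std::vector<std::string> out;
    if (word.size() < 3)
        return out;                       // avoid splitting two-letter words
    for (std::size_t cut = 1; cut < word.size(); ++cut) {
        const std::string first = word.substr(0, cut);
        const std::string second = word.substr(cut);
        if (dict.count(first) != 0 && dict.count(second) != 0) {
            out.push_back(first + " " + second);   // "two words" form
            out.push_back(first + "-" + second);   // hyphenated form
        }
    }
    return out;
}

// Usage check with a tiny hypothetical dictionary.
void demoSplit()
{
    const std::set<std::string> dict = {"a", "check", "lot", "spell"};
    const std::vector<std::string> want = {"spell check", "spell-check"};
    assert(splitSuggestions("spellcheck", dict) == want);
    assert(splitSuggestions("at", dict).empty());
}
```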

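
The recursive test in compoundgood() follows a similar pattern: split the word so that both parts have at least compoundmin letters, accept the head if it is a word, and accept the tail if it is either a word or itself a compound. The sketch below shows only that recursion, under the assumption of std::string words and a std::set dictionary; isCompound and demoCompound are hypothetical names, and the affix-flag options (FF_COMPOUNDONLY) and the capitalization checks done through whatcap() are not modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// True if `word` can be cut into two or more dictionary words, each at
// least `minPart` letters long. Stops at the first successful split.
bool isCompound(const std::string &word, const std::set<std::string> &dict,
                std::size_t minPart)
{
    if (minPart == 0 || word.size() < 2 * minPart)
        return false;                     // both halves need minPart letters
    for (std::size_t cut = minPart; cut + minPart <= word.size(); ++cut) {
        const std::string head = word.substr(0, cut);
        const std::string tail = word.substr(cut);
        if (dict.count(head) == 0)
            continue;
        if (dict.count(tail) != 0 || isCompound(tail, dict, minPart))
            return true;
    }
    return false;
}

// Usage check with a tiny hypothetical dictionary.
void demoCompound()
{
    const std::set<std::string> dict = {"ball", "field", "foot"};
    assert(isCompound("football", dict, 3));
    assert(isCompound("footballfield", dict, 3));   // three components
    assert(!isCompound("footba", dict, 3));
    assert(!isCompound("football", dict, 5));       // parts too short
}
```

The minimum part length keeps the check from accepting arbitrary strings assembled from very short dictionary entries, which is the reason the original comment gives for the 2 * compoundmin length test.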
Generated for tdespell2 by doxygen 1.9.4
This website is maintained by Timothy Pearson.