path: root/lib/Lex/Lexer.cpp
Age | Commit message | Author
2013-04-19 | [libclang] Make sure the preamble does not truncate comments. | Argyrios Kyrtzidis
rdar://13647445 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@179907 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-11 | Add -Wc99-compat warning for C11 unicode string and character literals. | Richard Smith
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@176817 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-09 | When lexing in C11 mode, accept unicode character and string literals, per C11 | Richard Smith
6.4.4.4/1 and 6.4.5/1. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@176780 91177308-0d34-0410-b5e6-96231b3b80d8
2013-03-05 | Preprocessor: don't consider // to be a line comment in -E -std=c89 mode. | Jordan Rose
It's beneficial when compiling to treat // as the start of a line comment even in -std=c89 mode, since it's not valid C code (with a few rare exceptions) and is usually intended as such. We emit a pedantic warning and then continue on as if line comments were enabled. This has been our behavior for quite some time. However, people use the preprocessor for things besides C source files. In today's prompting example, the input contains (unquoted) URLs, which contain // but should still be preserved. This change instructs the lexer to treat // as a plain token if Clang is in C90 mode and generating preprocessed output rather than actually compiling. <rdar://problem/13338743> git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@176526 91177308-0d34-0410-b5e6-96231b3b80d8
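The decision described in this commit can be sketched as a small three-way classification. This is only an illustration under stated assumptions: `LexMode`, `SlashSlash`, and `classifySlashSlash` are hypothetical names, not Clang's actual options or lexer API.

```cpp
// Hypothetical mode flags; these are NOT Clang's LangOptions fields.
struct LexMode {
  bool StandardHasLineComments; // false for C89/C90, true for C99 and later
  bool PreprocessOnly;          // true when only producing -E output
};

enum class SlashSlash {
  LineComment,                    // the standard allows // comments
  LineCommentWithPedanticWarning, // compiling C89: accept, but diagnose
  PlainToken                      // C89 with -E: leave the text untouched
};

// Decide how the two characters "//" should be treated.
SlashSlash classifySlashSlash(const LexMode &M) {
  if (M.StandardHasLineComments)
    return SlashSlash::LineComment;
  if (M.PreprocessOnly)
    return SlashSlash::PlainToken; // keeps unquoted URLs intact in the output
  return SlashSlash::LineCommentWithPedanticWarning;
}
```

With this split, an input such as an unquoted `http://example.org` survives `-E -std=c89` unchanged, while ordinary C89 compilation keeps the long-standing warn-and-continue behavior.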
2013-02-21 | Preprocessor: preserve whitespace in -traditional-cpp mode. | Jordan Rose
Note that unlike GNU cpp we currently do not preserve whitespace in macros (even in -traditional-cpp mode). <rdar://problem/12897179> git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@175778 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-09 | Properly validate UCNs for C99 and C++03 (both more restrictive than C(++)11). | Jordan Rose
Add warnings under -Wc++11-compat, -Wc++98-compat, and -Wc99-compat when a particular UCN is incompatible with a different standard, and -Wunicode when a UCN refers to a surrogate character in C++03. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@174788 91177308-0d34-0410-b5e6-96231b3b80d8
2013-02-08 | Pull Lexer's CharInfo table out for general use throughout Clang. | Jordan Rose
Rewriting the same predicates over and over again is bad for code size and code maintenance. Using the functions in <ctype.h> is generally unsafe unless they are specified to be locale-independent (i.e. only isdigit and isxdigit). The next commit will try to clean up uses of <ctype.h> functions within Clang. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@174765 91177308-0d34-0410-b5e6-96231b3b80d8
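A table-driven, locale-independent classifier of the kind this commit describes might look roughly like the following. It is a minimal sketch assuming an ASCII execution character set; `CharClass`, `charTable`, `isIdentStart`, and `isIdentBody` are hypothetical names and do not reproduce Clang's actual CharInfo interface.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Bit flags stored per byte value (hypothetical, for illustration only).
enum CharClass : std::uint8_t {
  CC_Upper = 1,
  CC_Lower = 2,
  CC_Digit = 4,
  CC_Under = 8
};

// One shared 256-entry table, built once on first use. Unlike <ctype.h>,
// the result never depends on the current locale.
inline const std::array<std::uint8_t, 256> &charTable() {
  static const std::array<std::uint8_t, 256> Table = [] {
    std::array<std::uint8_t, 256> T{}; // all entries start at 0
    for (std::size_t C = 'A'; C <= 'Z'; ++C) T[C] = CC_Upper;
    for (std::size_t C = 'a'; C <= 'z'; ++C) T[C] = CC_Lower;
    for (std::size_t C = '0'; C <= '9'; ++C) T[C] = CC_Digit;
    T['_'] = CC_Under;
    return T;
  }();
  return Table;
}

// Taking unsigned char avoids negative indices for bytes >= 0x80.
inline bool isIdentStart(unsigned char C) {
  return (charTable()[C] & (CC_Upper | CC_Lower | CC_Under)) != 0;
}

inline bool isIdentBody(unsigned char C) {
  return (charTable()[C] & (CC_Upper | CC_Lower | CC_Digit | CC_Under)) != 0;
}
```

Keeping every predicate as a mask over one table is what lets the same data serve many callers without duplicating comparison chains.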
2013-01-31 | Lexer: Don't warn about Unicode in preprocessor directives. | Jordan Rose
This allows people to use Unicode in their #pragma mark and in macros that exist only to be string-ized. <rdar://problem/13107323&13121362> git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@174081 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 | Fix r173881 to properly skip invalid UTF-8 characters in raw lexing and -E. | Jordan Rose
This caused hangs as we processed the same invalid byte over and over. <rdar://problem/13115651> git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173959 91177308-0d34-0410-b5e6-96231b3b80d8
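The hang comes from a scanning loop that fails to advance past a byte it cannot decode. A minimal sketch of a loop that always makes progress follows; the helper names are hypothetical, and the validity check is deliberately simplified (it looks only at lead and continuation byte structure, not at overlong encodings or surrogates).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of a structurally well-formed UTF-8 sequence starting at S[I],
// or 0 if the bytes there do not form one.
std::size_t utf8SequenceLength(const std::string &S, std::size_t I) {
  assert(I < S.size());
  const unsigned char Lead = static_cast<unsigned char>(S[I]);
  std::size_t Len = 0;
  if (Lead < 0x80) Len = 1;                // ASCII
  else if ((Lead >> 5) == 0x06) Len = 2;   // 110xxxxx
  else if ((Lead >> 4) == 0x0E) Len = 3;   // 1110xxxx
  else if ((Lead >> 3) == 0x1E) Len = 4;   // 11110xxx
  if (Len == 0 || I + Len > S.size())
    return 0; // stray continuation byte, bad lead byte, or truncated input
  for (std::size_t K = 1; K < Len; ++K)
    if ((static_cast<unsigned char>(S[I + K]) & 0xC0) != 0x80)
      return 0; // expected a 10xxxxxx continuation byte
  return Len;
}

// Walks the whole buffer and counts bytes that had to be skipped.
std::size_t countInvalidBytes(const std::string &S) {
  std::size_t Invalid = 0;
  for (std::size_t I = 0; I < S.size();) {
    std::size_t Len = utf8SequenceLength(S, I);
    if (Len == 0) {
      ++Invalid;
      Len = 1; // consume the bad byte so the loop cannot revisit it
    }
    I += Len;
  }
  return Invalid;
}
```

The key line is the `Len = 1` fallback: without it, the index never moves and the same invalid byte is processed over and over.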
2013-01-30 | Move UTF conversion routines from clang/lib/Basic to llvm/lib/Support | Dmitri Gribenko
This is required to use them in TableGen. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173924 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-30 | Don't warn about Unicode characters in -E mode. | Jordan Rose
People use the C preprocessor for things other than C files. Some of them have Unicode characters. We shouldn't warn about Unicode characters appearing outside of identifiers in this case. There's not currently a way for the preprocessor to tell if it's in -E mode, so I added a new flag, derived from the PreprocessorOutputOptions. This is only used by the Unicode warnings for now, but could conceivably be used by other warnings or even behavioral differences later. <rdar://problem/13107323> git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173881 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-28 | PR15067 (again): Don't warn about UCNs in C90 if we're raw-lexing. | Jordan Rose
Fixes a crash. Thanks, Richard. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173701 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-27 | PR15067: Don't assert when a UCN appears in a C90 file. | Jordan Rose
Unfortunately, we can't accept the UCN as an extension because we're required to treat it as two tokens for preprocessing purposes. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173622 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25 | Lexer.cpp: Fix a warning with ptrdiff_t on i686. [-Wsign-compare] | NAKAMURA Takumi
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173447 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-25 | Clarify comment: "diagnose" is better than "warn" when emitting an error. | Jordan Rose
Thanks, Dmitri. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173400 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-24 | Add a fixit for \U1234 -> \u1234. | Jordan Rose
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173371 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-24 | As an extension, treat Unicode whitespace characters as whitespace. | Jordan Rose
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173370 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-24 | Handle universal character names and Unicode characters outside of literals. | Jordan Rose
This is a missing piece for C99 conformance. This patch handles UCNs by adding a '\\' case to LexTokenInternal and LexIdentifier -- if we see a backslash, we tentatively try to read in a UCN. If the UCN is not syntactically well-formed, we fall back to the old treatment: a backslash followed by an identifier beginning with 'u' (or 'U'). Because the spelling of an identifier with UCNs still has the UCN in it, we need to convert that to UTF-8 in Preprocessor::LookUpIdentifierInfo. Of course, valid code that does *not* use UCNs will see only a very minimal performance hit (checks after each identifier for non-ASCII characters, checks when converting raw_identifiers to identifiers that they do not contain UCNs, and checks when getting the spelling of an identifier that it does not contain a UCN). This patch also adds basic support for actual UTF-8 in the source. This is treated almost exactly the same as UCNs except that we consider stray Unicode characters to be mistakes and offer a fixit to remove them. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@173369 91177308-0d34-0410-b5e6-96231b3b80d8
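The "tentatively try to read in a UCN, otherwise fall back" step can be sketched as a function that either consumes a whole well-formed UCN or consumes nothing. This is an illustration only: `tryReadUCN` and `hexValue` are hypothetical helpers, the sketch checks syntax alone (backslash, `u` plus 4 hex digits or `U` plus 8), and it does not validate which code points a given standard permits.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Value of a hexadecimal digit, or -1 if C is not one.
static int hexValue(char C) {
  if (C >= '0' && C <= '9') return C - '0';
  if (C >= 'a' && C <= 'f') return C - 'a' + 10;
  if (C >= 'A' && C <= 'F') return C - 'A' + 10;
  return -1;
}

// If a syntactically well-formed UCN starts at Src[Pos], stores its value in
// CodePoint and returns the number of characters consumed (6 or 10).
// Otherwise returns 0 and leaves CodePoint untouched, so the caller can fall
// back to lexing the backslash as a separate token.
std::size_t tryReadUCN(const std::string &Src, std::size_t Pos,
                       std::uint32_t &CodePoint) {
  if (Pos + 1 >= Src.size() || Src[Pos] != '\\')
    return 0;
  std::size_t Digits = 0;
  if (Src[Pos + 1] == 'u') Digits = 4;
  else if (Src[Pos + 1] == 'U') Digits = 8;
  if (Digits == 0 || Pos + 2 + Digits > Src.size())
    return 0;
  std::uint32_t Value = 0;
  for (std::size_t I = 0; I < Digits; ++I) {
    const int H = hexValue(Src[Pos + 2 + I]);
    if (H < 0)
      return 0; // not enough hex digits: not a UCN after all
    Value = (Value << 4) | static_cast<std::uint32_t>(H);
  }
  CodePoint = Value;
  return 2 + Digits;
}
```

Returning a consumed length of 0 rather than a partial result is the design choice that keeps the fallback cheap: nothing has to be rewound before the old backslash-then-identifier path runs.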
2013-01-12 | Remove useless 'llvm::' qualifier from names like StringRef and others that are | Dmitri Gribenko
brought into 'clang' namespace by clang/Basic/LLVM.h git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@172323 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-07 | Pull the bulk of Lexer::MeasureTokenLength() out into a new function, | Argyrios Kyrtzidis
Lexer::getRawToken(). No functionality change. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@171771 91177308-0d34-0410-b5e6-96231b3b80d8
2013-01-02 | s/CPlusPlus0x/CPlusPlus11/g | Richard Smith
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@171367 91177308-0d34-0410-b5e6-96231b3b80d8
2012-12-04 | Sort all of Clang's files under 'lib', and fix up the broken headers | Chandler Carruth
uncovered. This required manually correcting all of the incorrect main-module headers I could find, and running the new llvm/utils/sort_includes.py script over the files. I also manually added quite a few missing headers that were uncovered by shuffling the order or moving headers up to be main-module-headers. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@169237 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-28 | Teach Lexer::getSpelling about raw string literals. Specifically, if a raw | Richard Smith
string literal needs cleaning (because it contains line-splicing in the encoding prefix or in the ud-suffix), do not clean the section between the double-quotes -- that's the "raw" bit! git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@168776 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-17 | Fix crash on end-of-file after \ in a char literal, fixes PR14369. | Nico Weber
This makes LexCharConstant() look more like LexStringLiteral(), which doesn't have this bug. Add tests for eof after \ for several other cases. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@168269 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-14 | Fix an assertion failure printing the unused-label fixit in files using CRLF | Eli Friedman
line endings. <rdar://problem/12639047>. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@167900 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 | Revert r167801, "[preprocessor] When #including something that contributes no | Daniel Dunbar
tokens at all,". This change broke External/Nurbs in LLVM test-suite. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@167858 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 | UCNs in char literals are done (in LiteralSupport), remove FIXME. Expand UCN | Nico Weber
FIXME in LexNumericConstant. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@167818 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 | [preprocessor] When #including something that contributes no tokens at all, | Argyrios Kyrtzidis
don't recursively continue lexing. This avoids a stack overflow with a sequence of many empty #includes. rdar://11988695 git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@167801 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-13 | In Lexer::LexTokenInternal, avoid code duplication; no functionality change. | Argyrios Kyrtzidis
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@167800 91177308-0d34-0410-b5e6-96231b3b80d8
2012-11-11 | s/BCPLComment/LineComment/ | Nico Weber
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@167690 91177308-0d34-0410-b5e6-96231b3b80d8
2012-10-25 | Take into account that there may be a BOM at the beginning of the file, | Argyrios Kyrtzidis
when computing the size of the precompiled preamble. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@166659 91177308-0d34-0410-b5e6-96231b3b80d8
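The idea of accounting for a BOM when measuring the preamble can be sketched as below. Everything here is hypothetical: `utf8BOMLength`, `directivePrefixLength`, and `preambleSize` are illustrative names, and the "preamble" is crudely approximated as the leading run of lines that begin with `#`, which is far simpler than what a real preamble scan does.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// 3 if the buffer starts with the UTF-8 byte order mark EF BB BF, else 0.
std::size_t utf8BOMLength(const std::string &Buffer) {
  return Buffer.compare(0, 3, "\xEF\xBB\xBF") == 0 ? 3 : 0;
}

// Hypothetical stand-in for a preamble scan: number of bytes covered by the
// leading run of lines starting with '#', beginning at offset Start.
std::size_t directivePrefixLength(const std::string &Text, std::size_t Start) {
  assert(Start <= Text.size());
  std::size_t Pos = Start;
  while (Pos < Text.size() && Text[Pos] == '#') {
    const std::size_t NL = Text.find('\n', Pos);
    Pos = (NL == std::string::npos) ? Text.size() : NL + 1;
  }
  return Pos - Start;
}

// Size measured from the true start of the file, so the BOM bytes count.
std::size_t preambleSize(const std::string &Buffer) {
  const std::size_t BOM = utf8BOMLength(Buffer);
  return BOM + directivePrefixLength(Buffer, BOM);
}
```

If the BOM were skipped for scanning but not added back, the reported size would be three bytes short of the real file offset, which is the kind of mismatch this commit message describes avoiding.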
2012-09-24 | StringRef'ize Preprocessor::CreateString(). | Dmitri Gribenko
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@164555 91177308-0d34-0410-b5e6-96231b3b80d8
2012-09-06 | Don't cast away const needlessly. Found by gcc48 -Wcast-qual. | Roman Divacky
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@163325 91177308-0d34-0410-b5e6-96231b3b80d8
2012-08-31 | Make a bunch of methods on Lexer private. | Eli Friedman
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@162970 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-30 | Lexer: remove dead stores. Found by Clang static analyzer! | Dmitri Gribenko
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@160973 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-28 | Add warning flag -Winvalid-pp-token for preprocessing-tokens which have | Richard Smith
undefined behaviour, and move the diagnostic for '' from an Error into an ExtWarn in this group. This is important for some users of the preprocessor, and is necessary for gcc compatibility. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@159335 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-17 | Documentation cleanup: | James Dennett
* Removed docs for Lexer::makeFileCharRange from Lexer.cpp, as they're in the header file; * Reworked the documentation for SkipBlockComment so that it doesn't confuse Doxygen's comment parsing; * Added another summary with \brief markup. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158618 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 | [-E] Emit a rewritten _Pragma on its own line. | Jordan Rose
1. Teach Lexer that pragma lexers are like macro expansions at EOF. 2. Treat pragmas like #define/#undef when printing. 3. If we just printed a directive, add a newline before any more tokens. (4. Miscellaneous cleanup in PrintPreprocessedOutput.cpp) PR10594 and <rdar://problem/11562490> (two separate related problems) git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158571 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 | Documentation cleanup: escape backslashes in Doxygen comments. | James Dennett
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158552 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-15 | PR12717: Clang supports hexadecimal floating-point literals in all language | Richard Smith
modes. For languages other than C99/C11, this isn't quite a conforming extension, and for C++11, it breaks some reasonable code containing user-defined literals. In languages which don't officially have hexfloats, pare back this extension to only apply in cases where the token starts 0x and does not contain an underscore. The extension is still not quite conforming, but it's a lot closer now. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158487 91177308-0d34-0410-b5e6-96231b3b80d8
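The pared-back rule (outside languages with hexfloats, only accept the extension when the token starts with 0x and contains no underscore) can be sketched as a predicate over the number spelled so far. This is a hedged illustration: `mayTakeExponentSign` and its parameters are hypothetical and do not mirror the actual code in the lexer.

```cpp
#include <string>

// Decides whether a '+' or '-' following the text lexed so far may be
// absorbed into the numeric token as a binary-exponent sign.
// Spelling is the number lexed so far; it must end in 'p' or 'P'.
bool mayTakeExponentSign(const std::string &Spelling, bool LangHasHexFloats) {
  if (Spelling.empty() ||
      (Spelling.back() != 'p' && Spelling.back() != 'P'))
    return false;
  if (LangHasHexFloats)
    return true; // the language defines the 'p' exponent itself
  // Extension case: require a 0x prefix and no underscore, so tokens that
  // look like user-defined literals are left alone.
  const bool StartsWith0x =
      Spelling.size() >= 2 && Spelling[0] == '0' &&
      (Spelling[1] == 'x' || Spelling[1] == 'X');
  return StartsWith0x && Spelling.find('_') == std::string::npos;
}
```

Under these assumptions, `0x1p` followed by `-3` stays one token in every mode, while a spelling containing an underscore is no longer swallowed when the language lacks hexfloats.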
2012-06-15 | Fix PR13065. | David Blaikie
This condition (added in r158093) was overly conservative. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158483 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-08 | Correct method name in comment: from LexRawToken to LexFromRawLexer, according | Dmitri Gribenko
to a change done long ago in r57393. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158243 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-07 | Insert a space if necessary when suggesting CFBridgingRetain/Release. | Jordan Rose
This was a problem for people who write 'return(result);' Also fix ARCMT's corresponding code, though there's no test case for this because implicit casts like this are rejected by the migrator for being ambiguous, and explicit casts have no problem. <rdar://problem/11577346> git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158130 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06 | Add a -rewrite-includes option, which is similar to -rewrite-macros, but | David Blaikie
only expands #include directives. Patch contributed by Lubos Lunak (l.lunax@suse.cz). Review by Matt Beaumont-Gay (matthewbg@google.com). git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158093 91177308-0d34-0410-b5e6-96231b3b80d8
2012-06-06 | Escape \n and \r in doxycomment. | David Blaikie
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@158091 91177308-0d34-0410-b5e6-96231b3b80d8
2012-05-18 | Lexer::ReadToEndOfLine: Only build the string if it's actually used and do | Benjamin Kramer
so in a less malloc-intensive way. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@157064 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 | Support -Wc++98-compat-pedantic as requested: | Seth Cantrell
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20120409/056126.html git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@154655 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-13 | C++11 no longer requires files to end with a newline | Seth Cantrell
git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@154643 91177308-0d34-0410-b5e6-96231b3b80d8
2012-04-07 | ext_reserved_user_defined_literal must not default to Error in | Francois Pichet
MicrosoftMode. Hence create ext_ms_reserved_user_defined_literal that doesn't default to Error; otherwise MSVC headers won't parse. Fixes PR12383. git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@154273 91177308-0d34-0410-b5e6-96231b3b80d8
2012-03-11 | Unify naming of LangOptions variable/get function across the Clang stack | David Blaikie
(Lex to AST). The member variable is always "LangOpts" and the member function is always "getLangOpts". Reviewed by Chris Lattner git-svn-id: https://llvm.org/svn/llvm-project/cfe/trunk@152536 91177308-0d34-0410-b5e6-96231b3b80d8