LinuxQuestions.org
Old 01-26-2010, 07:57 PM   #16
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454

Quote:
Originally Posted by MTK358
Next, I don't understand extract_bracketed()'s third parameter.
???????????????

From the documentation:


Quote:
and a prefix pattern. As before, a missing prefix defaults to optional whitespace
So what exactly is not clear?
 
Old 01-27-2010, 07:36 AM   #17
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
So to extract the pattern "keyword { ... }", you would use this, right?

Code:
extract_bracketed($text, '{}', 'keyword\s*');
 
Old 01-27-2010, 08:50 AM   #18
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
OK, I figured that out.

Now, how do I find every instance of 'keyword { ... }' in a string and store it separately, including the prefix?
 
Old 01-27-2010, 12:06 PM   #19
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by MTK358
So to extract the pattern "keyword { ... }", you would use this, right?

Code:
extract_bracketed($text, '{}', 'keyword\s*');
From reading the documentation this is what I understand. And it even works for me.
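A minimal sketch of that call in list context, assuming the text starts with the keyword (the sample string is made up for illustration). The three return values are the bracketed text, whatever follows it, and the prefix that was skipped:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Made-up sample input; the prefix must match right at the start of the text.
my $text = 'keyword { a { nested } block } trailing text';

# List context: (bracketed text, rest of the string, skipped prefix).
my ($extracted, $remainder, $prefix) =
    extract_bracketed($text, '{}', 'keyword\s*');

print "extracted: [$extracted]\n";
print "remainder: [$remainder]\n";
print "prefix:    [$prefix]\n";
```

Nested braces stay inside $extracted, and $text itself is left unmodified in list context.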
 
Old 01-27-2010, 12:11 PM   #20
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by MTK358
OK, I figured that out.

Now, how do I find every instance of 'keyword { ... }' in a string and store it separately, including the prefix?
The documentation tells you about $extracted, $remainder and $prefix, doesn't it? And if you agree with me that it does, doesn't $remainder ring a bell? I.e., don't you think $remainder can be fed back in as $text as many times as one wants?
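The hint above can be sketched roughly as follows. The helper name find_keyword_blocks, the sample string, and the "skip ahead" prefix pattern are assumptions of this sketch, not something taken from the Text::Balanced documentation. Since the prefix has to match from the start of the text, the pattern lazily skips everything up to the next keyword that is followed by an opening brace:

```perl
use strict;
use warnings;
use Text::Balanced qw(extract_bracketed);

# Hypothetical helper: collect every "keyword { ... }" found in $text.
sub find_keyword_blocks {
    my ($text, $keyword) = @_;

    # Skip lazily (across newlines) to the next keyword that is followed
    # by an opening brace. This prefix pattern is an assumption.
    my $skip = '(?s).*?\b' . quotemeta($keyword) . '\s*(?=\{)';

    my @blocks;
    while (1) {
        my ($extracted, $remainder, $prefix) =
            extract_bracketed($text, '{}', $skip);
        last unless defined $extracted and length $extracted;

        # $prefix holds everything skipped; keep only its trailing keyword part.
        my ($head) = $prefix =~ /(\b\Q$keyword\E\s*)\z/;
        push @blocks, $head . $extracted;

        # Feed the remainder back in as the new text.
        $text = $remainder;
    }
    return @blocks;
}

my @found = find_keyword_blocks(
    'junk keyword { a { b } } more keyword{c} keyword ; end', 'keyword');
print "$_\n" for @found;
```

The loop stops at the first failure, so a block with unbalanced braces also ends the search; a real parser would probably want to report that case instead of silently stopping.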
 
Old 01-27-2010, 01:08 PM   #21
sundialsvcs
LQ Guru
 
Registered: Feb 2004
Location: SE Tennessee, USA
Distribution: Gentoo, LFS
Posts: 10,691
Blog Entries: 4

Rep: Reputation: 3947
You might find it more convenient to use one of the actual parsers that are available in the CPAN library.

You define a grammar for whatever language that you want to process, and the parser does all the heavy-lifting for you.

http://search.cpan.org ... it's your best friend.
 
Old 01-27-2010, 01:13 PM   #22
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
I still think the easiest way would be to write my own simple, specialized parser for it, that would even understand C style comments and give more meaningful, compiler-like error messages.

But how do you iterate through the chars in a string in Perl?
 
Old 01-27-2010, 01:38 PM   #23
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by MTK358
I still think the easiest way would be to write my own simple, specialized parser for it, that would even understand C style comments and give more meaningful, compiler-like error messages.

But how do you iterate through the chars in a string in Perl?
perldoc -f substr
perldoc -f length
.
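For illustration, a minimal sketch of walking a string one character at a time with those two functions (the sample string is made up):

```perl
use strict;
use warnings;

my $string = 'a{b}';
my @chars;

# length() gives the number of characters; substr() with a length of 1
# picks out the character at offset $i.
for my $i (0 .. length($string) - 1) {
    my $char = substr($string, $i, 1);
    push @chars, $char;
}

print join(' ', @chars), "\n";
```

split(//, $string) produces the same list of characters in one call, if the index is not needed.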

Plus remember that $prefix is returned too.

And no, don't reinvent the wheel - so far Text::Balanced has all you need, and you'll have to add minimum glue code.

There is Perl code understanding "C" comments around, and it's a FAQ.
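As a rough idea of what such code looks like, here is a simplified sketch. It is not the regex from the Perl FAQ; the helper name strip_c_comments is hypothetical, and it only copes with plain string and character literals (no trigraphs, no backslash-newline inside comments):

```perl
use strict;
use warnings;

# Hypothetical helper: remove /* ... */ and // comments while leaving
# string and character literals untouched. A sketch, not the FAQ regex.
sub strip_c_comments {
    my ($src) = @_;
    $src =~ s{
        /\* .*? \*/                  # block comment, may span lines
      | // [^\n]*                    # line comment, newline is kept
      | ( " (?: \\. | [^"\\] )* "    # string literal, captured and kept
        | ' (?: \\. | [^'\\] )* ' )  # character literal, captured and kept
    }{ defined $1 ? $1 : ' ' }gsex;
    return $src;
}

my $out = strip_c_comments(
    qq{int a; /* gone */ char *s = "/* kept */"; // tail\nint b;\n});
print $out;
```

Matching literals in the same alternation is what keeps comment-like text inside strings from being removed; each comment is replaced by a single space so that neighbouring tokens do not run together.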

Also, read GNU CPP documentation

(
http://gcc.gnu.org/onlinedocs/cpp/
http://tigcc.ticalc.org/doc/cpp.html
)
- for me GNU CPP is the default tool for getting rid of "C" comments.
 
Old 01-27-2010, 01:44 PM   #24
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
Quote:
Originally Posted by Sergei Steshenko
perldoc -f substr
perldoc -f length
.

Plus remember that $prefix is returned too.

And no, don't reinvent the wheel - so far Text::Balanced has all you need, and you'll have to add minimum glue code.

There is Perl code understanding "C" comments around, and it's a FAQ.

Also, read GNU CPP documentation

(
http://gcc.gnu.org/onlinedocs/cpp/
http://tigcc.ticalc.org/doc/cpp.html
)
- for me GNU CPP is the default tool for getting rid of "C" comments.
Yeah, I am trying to make my own parser and it just seems to get quite ugly fast, so maybe I should return to Text::Balanced.

And I didn't know that you can use CPP to strip out C-style comments. I'll read up on it and see if I can get it to work.
 
Old 01-27-2010, 01:56 PM   #25
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
Basically, I wonder how to make CPP only remove comments and merge lines ending with backslashes, but not process macros, includes, etc.?
 
Old 01-27-2010, 02:04 PM   #26
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by MTK358
Basically, I wonder how to make CPP only remove comments and merge lines ending with backslashes, but not process macros, includes, etc.?
The answer is here: http://gcc.gnu.org/onlinedocs/cpp/In...tml#Invocation .
 
Old 01-27-2010, 03:19 PM   #27
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
I can't seem to figure it out from there.
 
Old 01-27-2010, 03:50 PM   #28
Sergei Steshenko
Senior Member
 
Registered: May 2005
Posts: 4,481

Rep: Reputation: 454
Quote:
Originally Posted by MTK358
I can't seem to figure it out from there.
Really? Did you read it? Did you look for all occurrences of the word "comment"?

...

Let me tell you something. When I studied English, I pretty quickly discovered that the appropriate meaning of a word unknown to me was not among the first meanings given by the dictionary.

The same applies to SW documentation - often the needed info is not in the beginning. According to my understanding, it is possible to suppress macro expansion while still processing comments.
 
Old 01-27-2010, 03:56 PM   #29
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
Using the browser's find function, I discovered "-fpreprocessed" may do the job. But the problem is that it doesn't splice escaped newlines.

EDIT: that might not be an issue, I can probably splice escaped newlines in Perl using s/\\\n//g.

EDIT2: I've tested the splicing trick, and it seems to work just as described in the CPP manual.
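A small sketch of that substitution on a made-up two-line macro, assuming Unix line endings (a backslash followed by "\r\n" would not be matched by this pattern):

```perl
use strict;
use warnings;

# Made-up input: a macro continued onto a second line with a backslash.
my $source = "#define MAX(a, b) \\\n    ((a) > (b) ? (a) : (b))\nint x;\n";

# Delete every backslash that is immediately followed by a newline,
# together with that newline, which joins the two physical lines.
(my $spliced = $source) =~ s/\\\n//g;

print $spliced;
```

The leading whitespace of the continuation line is kept, so the joined line simply has a few extra spaces in it.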

Now, how do you find all instances of "keyword { ... }" and process them with Text::Balanced?

Last edited by MTK358; 01-27-2010 at 04:03 PM.
 
Old 01-27-2010, 08:09 PM   #30
MTK358
LQ 5k Club
 
Registered: Sep 2009
Posts: 6,443

Original Poster
Blog Entries: 3

Rep: Reputation: 723
Anyway, here is my current code. It slurps the file, splices escaped newlines, writes the result to a new file with the extension changed to ".c", runs cpp to strip the comments into a temp file, and then replaces the new file with the temp file.

The remarkable thing is that it worked perfectly the first try!!!

Code:
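#!/usr/bin/env perl
use strict;
use warnings;

foreach my $filename (@ARGV) {
	# Slurp the whole file at once.
	open(my $in, '<', $filename) or die "Cannot read $filename: $!";
	my $file = do { local $/; <$in> };
	close($in);

	# Splice escaped newlines (a backslash immediately followed by a newline).
	$file =~ s/\\\n//g;

	# Swap the extension for ".c". Note: an input that already ends in ".c"
	# (or has no extension at all) gets overwritten in place.
	(my $outfilename = $filename) =~ s/(.*)\..*/$1.c/;
	open(my $out, '>', $outfilename) or die "Cannot write $outfilename: $!";
	print $out $file;
	close($out) or die "Cannot close $outfilename: $!";

	# -fpreprocessed: remove comments without expanding macros or includes.
	system('cpp', '-fpreprocessed', $outfilename, '-o', "$outfilename.temp") == 0
		or die "cpp failed on $outfilename";

	# Replace the intermediate file with cpp's output.
	rename("$outfilename.temp", $outfilename)
		or die "Cannot rename $outfilename.temp: $!";
}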
 
  



All times are GMT -5. The time now is 12:24 AM.
