How to Extract Strings From Source Code?

A while ago, for a global mail project, I developed a simple online tool (using Node.js) to extract Chinese strings and merge translations automatically.

The tool doesn’t care about the source language. It works line by line and extracts non-ASCII strings, skipping other content such as HTML code.
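The tool's own source isn't shown here, so the following is only a minimal sketch of the line-based idea, not the actual implementation. The function name `extractNonAsciiStrings` and the tag-stripping regex are assumptions made for illustration. It also assumes an HTML tag never spans more than one line, and it treats each run of consecutive non-ASCII characters as one string.

```javascript
// Hypothetical sketch: line-based extraction of non-ASCII runs.
// Not the original tool; names and regexes are illustrative assumptions.
function extractNonAsciiStrings(source) {
  const results = [];
  source.split(/\r?\n/).forEach((line, index) => {
    // Replace HTML tags with a space so attribute values are not reported.
    // Crude: assumes tags do not span lines and "<"/">" are not operators.
    const withoutTags = line.replace(/<[^>]*>/g, ' ');
    // Collect every run of characters outside the ASCII range.
    const runs = withoutTags.match(/[^\x00-\x7F]+/g) || [];
    for (const text of runs) {
      results.push({ line: index + 1, text });
    }
  });
  return results;
}
```

Because it looks only at characters and never at syntax, a sketch like this cannot tell a string literal from a comment, and it would miss any ASCII text that also needs translating.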

It’s only a simple helper tool, and I haven’t had much time to improve it.

I’ve long been searching for open source tools that extract strings from all kinds of source code, but there doesn’t seem to be one.

Last month, I found something that fits: highlight.js!

Yes, it knows lots of programming languages, and that’s exactly what I was looking for.

I tried modifying its core JS to test the idea, and it works: a syntax highlighting library can be used to extract strings.
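The change to the core isn't shown, so here is a different, simpler route to the same idea: leave the library untouched and post-process the HTML it produces. This sketch assumes string literals are wrapped in `<span class="hljs-string">`, and that the highlighter escapes `&`, `<`, `>`, `"` and `'` as HTML entities. The helper names `decodeEntities` and `extractHighlightedStrings` are hypothetical.

```javascript
// Hypothetical sketch: collect string literals from highlighted HTML.
// This post-processes the output; it does not modify the highlighter itself.
function decodeEntities(text) {
  const map = { '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#x27;': "'", '&amp;': '&' };
  return text.replace(/&(?:lt|gt|quot|#x27|amp);/g, (entity) => map[entity]);
}

function extractHighlightedStrings(html) {
  // Tokens: an opening span (class captured), a closing span, or plain text.
  const tokenPattern = /<span class="([^"]*)">|<\/span>|[^<]+/g;
  const strings = [];
  const stack = [];      // class names of the spans currently open
  let captureDepth = -1; // stack depth where the current string span opened
  let buffer = '';
  let match;
  while ((match = tokenPattern.exec(html)) !== null) {
    const token = match[0];
    if (token === '</span>') {
      stack.pop();
      // Closing the span that started the capture ends the string.
      if (stack.length === captureDepth) {
        strings.push(decodeEntities(buffer));
        captureDepth = -1;
        buffer = '';
      }
    } else if (match[1] !== undefined) {
      if (captureDepth === -1 && match[1].split(' ').includes('hljs-string')) {
        captureDepth = stack.length;
      }
      stack.push(match[1]);
    } else if (captureDepth !== -1) {
      // Text inside a string span, including text of nested spans.
      buffer += token;
    }
  }
  return strings;
}
```

With the library installed, the input HTML would typically come from something like `hljs.highlight(code, { language: 'javascript' }).value` in recent highlight.js versions. The extracted strings keep their quote characters, so a real tool would still need to strip them per language.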

It’s great!