Wrox Programmer Forums

You are currently viewing the Perl section of the Wrox Programmer to Programmer discussions. This is a community of software programmers and website developers including Wrox book authors and readers. New member registration was closed in 2019. New posts were shut off and the site was archived into this static format as of October 1, 2020. If you require technical support for a Wrox book please contact http://hub.wiley.com
 
Old April 15th, 2009, 01:39 AM
Authorized User
 
Join Date: Mar 2009
Posts: 30
Thanks: 0
Thanked 0 Times in 0 Posts
How to find non-ASCII characters using Perl

Dear all,

Please tell me how to find non-English (non-ASCII) characters with a regular expression, something like:
$line =~ m/[regular expression] /g;


example:
values of these surfactant–cobalt



Thanks,
Thava
 
Old April 21st, 2009, 11:44 AM
Friend of Wrox
 
Join Date: Dec 2003
Posts: 488
Thanks: 0
Thanked 3 Times in 3 Posts

Code:
#!/usr/bin/perl
use warnings;
use strict;

while(<>) {
  tr/\000-\177//cd;  # delete every character outside the ASCII range (octal 000-177)
  print;
}
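
Note that the script above strips non-ASCII characters rather than locating them. As a rough sketch of the "find" half of the question, a negated character class will match them and pos() gives the offset. This assumes the input has already been decoded to Perl characters, and the sub name find_non_ascii is made up for illustration:

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper (not from the thread): returns one
# [offset, character, code point] triple for every character
# outside the ASCII range 0x00-0x7F.
sub find_non_ascii {
    my ($line) = @_;
    my @hits;
    while ($line =~ /([^\x00-\x7F])/g) {
        # pos() points just past the one-character match, so subtract 1
        push @hits, [ pos($line) - 1, $1, ord($1) ];
    }
    return @hits;
}

# \x{2013} is the en dash from the example line in the question
my $line = "values of these surfactant\x{2013}cobalt";
for my $hit (find_non_ascii($line)) {
    printf "offset %d: U+%04X\n", $hit->[0], $hit->[2];
}
```

If the data comes from a file, it would need to be opened with a decoding layer such as '<:encoding(UTF-8)' first; otherwise each byte of a multi-byte sequence is reported as a separate hit.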
__________________
--
Charlie Harvey's website - linux, perl, java, anarchism and punk rock: http://charlieharvey.org.uk
 
Old August 11th, 2011, 06:22 AM
Registered User
 
Join Date: Jul 2011
Posts: 9
Thanks: 2
Thanked 0 Times in 0 Posts

Hi,

How can I replace non-ASCII characters with XML numeric character entities?

e.g.:

values of these surfactant–cobalt

output:

values of these surfactant &#x????; cobalt
 
Old November 13th, 2011, 10:24 PM
Friend of Wrox
 
Join Date: Dec 2003
Posts: 488
Thanks: 0
Thanked 3 Times in 3 Posts

Would HTML::Entities work for you? If not, something like this:

Code:
#!/usr/bin/perl
use warnings;
use strict;
use utf8;
use 5.10.1;


binmode STDOUT, ':utf8'; # needed on some terminals where you don't default to utf8
my $unicode_string="vis-à-vis Beyoncé's naïve\npapier-mâché résumé";
say "Start: $unicode_string";
$unicode_string =~s/([^[:ascii:]])/'&#' . ord($1) . ';'/ge;
say "End: $unicode_string";
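
The code above emits decimal references (&#224; and so on). Since the question asks for the hexadecimal form, the same substitution could use sprintf instead of a bare ord. A minimal sketch, again assuming the string is already decoded to characters; the sub name to_hex_entities is hypothetical:

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper (not from the thread): replace each non-ASCII
# character with a hexadecimal XML numeric character reference.
sub to_hex_entities {
    my ($text) = @_;
    $text =~ s/([^[:ascii:]])/sprintf('&#x%04X;', ord($1))/ge;
    return $text;
}

# \x{2013} is the en dash from the example line in the question
print to_hex_entities("values of these surfactant\x{2013}cobalt"), "\n";
```

For the example line this turns the en dash into &#x2013;. Both the decimal and the hexadecimal forms are valid XML, so which one to use is mostly a matter of taste.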










Copyright (c) 2020 John Wiley & Sons, Inc.