# Detecting Hebrew Characters in PHP Strings

In PHP, is there a known safe/reliable way to:

1. Detect, generically, a Hebrew character in a string of otherwise plain English characters.
2. Replace that character with something else.

I know I could use mb_ereg_replace to replace a fixed set of specific characters. However, I'm interested in being able to scan a string that might contain any Hebrew character, and then replace whatever is found.

That is, I might have two strings like this:

<?php
$string1 = "Look at this hebrew character: חַ. Isn't it great?";
$string2 = "Look at this other hebrew character: יַָ. It is also great?";


I want a single function that would give me the following strings:

Look at this hebrew character: \texthebrew{ח}. Isn't it great?
Look at this other hebrew character: \texthebrew{י}. It is also great?


In theory I know I could scan the string for characters in the Hebrew Unicode range and detect those, but how character encoding on strings works in PHP has always been a little hazy for me, and I'd rather use a proven/known solution if such a thing exists.

The mb_ereg_replace_callback function is useful in your case. Its regular expression dialect supports named Unicode properties, including the Hebrew property, which covers the characters of the Hebrew Unicode block (IntlChar::BLOCK_CODE_HEBREW).
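For the detection half of the question on its own, the same property can be used with mb_ereg. This is only a minimal sketch: the function name containsHebrew is made up for illustration, and it assumes the input is valid UTF-8.

```php
<?php
// Hypothetical helper (not part of mbstring): reports whether $text
// contains at least one character matched by \p{Hebrew}.
function containsHebrew(string $text): bool
{
    // Tell the mb_ereg* functions to treat patterns and subjects as UTF-8.
    mb_regex_encoding('UTF-8');

    // mb_ereg() returns false when the pattern does not match.
    return mb_ereg('\p{Hebrew}', $text) !== false;
}

var_dump(containsHebrew("Look at this hebrew character: \u{05D7}\u{05B7}.")); // bool(true)
var_dump(containsHebrew("Plain English only."));                              // bool(false)
```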

All you need to do is to mask the Hebrew segments:

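mb_regex_encoding('utf-8');
// $subject holds the input string, e.g. $string1 from the question
var_dump(mb_ereg_replace_callback('\p{Hebrew}+', function ($matches) {
    // $matches[0] is the matched run of Hebrew characters
    return vsprintf('\texthebrew{%s}', $matches);
}, $subject));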


Output:

string(65) "Look at this hebrew character: \texthebrew{חַ}. Isn't it great?"


As the output shows, the four bytes that make up the two code points (the letter and its vowel point) are properly wrapped in one segment.
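Note that this output keeps the vowel point inside the braces, while the strings asked for in the question show only the bare letter. If that difference matters, the call can be put into a single function with an option to drop the marks. The following is a sketch, not a definitive implementation: the name wrapHebrew and the $stripMarks flag are invented here, and it assumes that removing every nonspacing mark (Unicode category Mn) from a Hebrew run is acceptable.

```php
<?php
// Hypothetical wrapper around mb_ereg_replace_callback(); the function name
// and the $stripMarks parameter are illustrative assumptions.
function wrapHebrew(string $text, bool $stripMarks = false): string
{
    mb_regex_encoding('UTF-8');

    $result = mb_ereg_replace_callback(
        '\p{Hebrew}+',
        function (array $matches) use ($stripMarks): string {
            $segment = $matches[0];
            if ($stripMarks) {
                // Drop nonspacing marks (category Mn), which includes Hebrew
                // vowel points such as patah (U+05B7) and qamats (U+05B8).
                $segment = preg_replace('/\p{Mn}+/u', '', $segment);
            }
            return '\texthebrew{' . $segment . '}';
        },
        $text
    );

    // Fall back to the unchanged input if the replacement failed.
    return is_string($result) ? $result : $text;
}

$string1 = "Look at this hebrew character: \u{05D7}\u{05B7}. Isn't it great?";
echo wrapHebrew($string1), "\n";       // vowel point kept inside the braces
echo wrapHebrew($string1, true), "\n"; // only the base letter inside the braces
```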

I don't know of any other way to do that in PHP with that little code.
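For comparison, a similar result can probably be had with the PCRE functions, which also accept \p{Hebrew} when the pattern uses the u modifier. This swaps in preg_replace_callback for the mbstring function shown above, so treat it as an alternative sketch rather than the same method; the function name wrapHebrewPcre is made up, and the input is assumed to be valid UTF-8.

```php
<?php
// Alternative sketch using PCRE instead of the mb_ereg* functions.
// The function name is a hypothetical choice for this example.
function wrapHebrewPcre(string $text): string
{
    $result = preg_replace_callback(
        '/\p{Hebrew}+/u', // "u" makes pattern and subject UTF-8 aware
        function (array $matches): string {
            return '\texthebrew{' . $matches[0] . '}';
        },
        $text
    );

    // preg_replace_callback() returns null on error (e.g. invalid UTF-8);
    // in that case hand back the input unchanged.
    return $result ?? $text;
}

echo wrapHebrewPcre("Look at this hebrew character: \u{05D7}\u{05B7}. Isn't it great?"), "\n";
```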