I Solved it :D_the "recurrent_directory_iterator" inside std::filesystem Doesn't support files with "Unicode Characters"

Discussion related to the LuxCore functionality, implementations and API.
a1-kh
Posts: 33
Joined: Wed Nov 06, 2019 1:55 pm

Post by a1-kh »

I believe the problem lies in the way "std::recursive_directory_iterator" handles the files.
I tried printing only the file paths from the iterator with:

cout << file.path().string();

It printed the files until it reached a file with Unicode characters, then crashed.
I tried:

cout << file.path().u8string();

.. it didn't crash, but the Unicode characters were completely garbled.
I tried:

wcout << file.path().u8string();

.. no luck.
I tried:

wcout << file.path().u16string();
wcout << file.path().u32string();

.. u16string() and u32string() don't exist.
The error:
Unhandled exception at 0x00007FFB078C3E49 in ConsoleApplication1.exe: Microsoft C++ exception: std::system_error at memory location 0x000000DD992FE8D8.
It is thrown here, in <filesystem>:
[[noreturn]] inline void _Throw_system_error_from_std_win_error(const __std_win_error _Errno) {
    _THROW(system_error{_Make_ec(_Errno)});
}

I also forgot to mention that I tried "std::wcout << file.path().wstring();" yesterday, but that didn't work either, because it skipped the Unicode files altogether.
Can anyone help me, please?
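For readers hitting the same wall, here is a minimal sketch of one way to list paths without going through string(). As far as I know, MSVC's string() converts to the active ANSI code page and throws std::system_error when a character has no mapping there, which would match the crash above. Also, std::filesystem::path does provide u16string() and u32string(); the streams simply have no operator<< for those types. The helper names toUtf8, listPathsUtf8 and printPathsUtf8 are made up for this example, and the SetConsoleOutputCP call is an assumption about what the console needs, not something confirmed in this thread.

```cpp
#include <filesystem>
#include <iostream>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>  // SetConsoleOutputCP, CP_UTF8
#endif

namespace fs = std::filesystem;

// Hypothetical helper: return the path as UTF-8 bytes in a std::string.
// u8string() returns std::string in C++17 and std::u8string in C++20;
// copying through iterators works with both.
std::string toUtf8(const fs::path& p)
{
    const auto u8 = p.u8string();
    return std::string(u8.begin(), u8.end());
}

// Hypothetical helper: collect every entry below `root` as a UTF-8 string.
std::vector<std::string> listPathsUtf8(const fs::path& root)
{
    std::vector<std::string> result;
    for (const auto& entry : fs::recursive_directory_iterator(root))
        result.push_back(toUtf8(entry.path()));
    return result;
}

// Hypothetical helper: print the UTF-8 paths, one per line.
void printPathsUtf8(const fs::path& root)
{
#ifdef _WIN32
    // Assumption: the console must be told that the bytes are UTF-8.
    SetConsoleOutputCP(CP_UTF8);
#endif
    for (const auto& s : listPathsUtf8(root))
        std::cout << s << '\n';
}
```

Calling printPathsUtf8(".") from main would list the current directory. Whether the characters actually show up still depends on the console font.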
Last edited by a1-kh on Sat Jan 16, 2021 7:58 pm, edited 1 time in total.
a1-kh
Posts: 33
Joined: Wed Nov 06, 2019 1:55 pm

Post by a1-kh »

Actually, I believe the problem lies in the way "std::recursive_directory_iterator" handles the files.

I tried printing only the file paths from the iterator with:
cout << file.path().string();
It printed the files until it reached a file with Unicode characters, then crashed.

I tried: cout << file.path().u8string();
.. it didn't crash, but the Unicode characters were completely garbled.

I tried: wcout << file.path().u8string();
.. no luck.

I tried: wcout << file.path().u16string();
wcout << file.path().u32string();
.. u16string() and u32string() don't exist.

I tried: std::wcout << file.path().wstring();
.. the program skipped the Unicode files altogether.
TAO
Developer
Posts: 850
Joined: Sun Mar 24, 2019 4:49 pm
Location: France

Post by TAO »

I faced the same issue for a long time, and I managed to fix it by converting all characters, symbols, and separators to standard characters, to be sure it works as it is supposed to.
a1-kh
Posts: 33
Joined: Wed Nov 06, 2019 1:55 pm

Post by a1-kh »

TAO wrote: Sun Dec 20, 2020 7:17 pm I faced the same issue for a long time, and I managed to fix it by converting all characters, symbols, and separators to standard characters, to be sure it works as it is supposed to.
Hi TAO, thanks for the reply. I am trying to learn C++; it is fun, but a real pain for simple things like this.
Do you believe there is no other solution? It doesn't seem logical: C++ is famous worldwide, yet a simple thing like Unicode file names is not doable?
TAO
Developer
Posts: 850
Joined: Sun Mar 24, 2019 4:49 pm
Location: France

Post by TAO »

To be honest, it's not that hard: you can create a function, pass the text to it, and replace any unwanted character with a normal character or symbol like "_".
This is my code in MaxLuxCore.

Code: Select all

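#include <algorithm>  // std::remove_if, std::replace_if
#include <cctype>     // std::isspace, std::ispunct
#include <string>

std::string MaxToLuxUtils::removeChars(std::string& str)
{
	// <cctype> predicates must be given an unsigned char value; passing a
	// negative char (any byte >= 0x80) directly is undefined behavior.
	auto isSpace = [](unsigned char c) { return std::isspace(c) != 0; };
	auto isPunct = [](unsigned char c) { return std::ispunct(c) != 0; };

	// Remove all whitespace (isspace already covers what isblank matched).
	str.erase(std::remove_if(str.begin(), str.end(), isSpace), str.end());
	// Replace every punctuation character with '_'.
	std::replace_if(str.begin(), str.end(), isPunct, '_');
	return str;
}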
And use it like this :

Code: Select all

std::string mapName = getTexturePathName(refno, material);  // or any other string
MaxToLux->removeChars(mapName);  // modifies mapName in place (and also returns a copy)
and to get the path or name in the first place use :

Code: Select all

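std::string MaxToLuxMaterials::getTexturePathName(int paramID, ::Mtl* mat)
{
	// NOTE: `tex` is not declared in this excerpt; the lookup of the texture
	// (presumably from `mat` and `paramID`) was omitted from the post.
	BitmapTex *bmt = (BitmapTex*)tex;
	BitmapInfo bi(bmt->GetMapName());

	if (tex->GetName() != NULL)
	{
		std::string texName = tex->GetName().ToCStr();
		MaxToLux->removeChars(texName);
		return texName;
	}
	return std::string();  // no name: return an empty string rather than fall off the end
}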
Use a combination of this code and you will be fine.
Of course, you can also check for any character other than standard lowercase and uppercase letters and digits, and replace it in exactly the same way. In my case I did that using the 3ds Max language manager, and I think it's unnecessary for your purpose anyway.
I hope that helps you.
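The "anything other than letters and digits" variant mentioned above could look roughly like this. It is only a sketch: replaceNonAlnum is a made-up name, not part of MaxToLux, and it works byte by byte, so every byte of a multi-byte UTF-8 character becomes its own "_".

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not part of MaxToLux): returns a copy of `name` in
// which every byte that is not an ASCII letter or digit is replaced by '_'.
std::string replaceNonAlnum(std::string name)
{
    std::replace_if(name.begin(), name.end(),
        [](unsigned char c) {
            // Bytes >= 0x80 belong to multi-byte UTF-8 sequences (or to a
            // legacy code page); treat them as "not alphanumeric".
            return c >= 0x80 || std::isalnum(c) == 0;
        },
        '_');
    return name;
}
```

For example, replaceNonAlnum("Wood Floor(2).jpg") gives "Wood_Floor_2__jpg". Note that the dot before the extension is replaced too, so you may want to sanitize only the file stem.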
TAO
Developer
Posts: 850
Joined: Sun Mar 24, 2019 4:49 pm
Location: France

Post by TAO »

If you are targeting Windows, you can also use "boost" or any other library, or even plain C++11, to convert between UTF formats, for example:

Code: Select all

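// Needs <codecvt> (deprecated since C++17), <locale> and <boost/filesystem.hpp>.
#if defined(WIN32)
	// Treat narrow (char) strings as UTF-8 and convert them to/from UTF-16.
	boost::filesystem::path::imbue(
			std::locale(std::locale(), new std::codecvt_utf8_utf16<wchar_t>()));
#endif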
This converts between UTF-8 and UTF-16 on Windows, because in LuxRays/LuxCore all file names are UTF-8 encoded. UTF-8 works as-is on Linux/macOS, but Windows requires conversion to UTF-16.

Personally, I use both methods for this purpose: first I convert to the right UTF encoding, then I remove any unwanted characters to avoid problems and crashes. The second part is really helpful with images in maps and similar things.
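To make the conversion step less of a black box, here is a rough, portable sketch of what decoding UTF-8 into UTF-16 involves. The function name utf8ToUtf16 is invented for this example; it is not how boost or LuxCore do it, and on Windows you would more likely use MultiByteToWideChar or let the path class convert for you. For brevity it does not reject overlong encodings or out-of-range code points.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper: decode UTF-8 bytes into UTF-16 code units.
// Throws std::invalid_argument on truncated or malformed sequences.
std::u16string utf8ToUtf16(const std::string& in)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::uint32_t cp = 0;   // decoded code point
        std::size_t extra = 0;  // number of continuation bytes expected
        if (lead < 0x80)              { cp = lead;         extra = 0; }
        else if ((lead >> 5) == 0x06) { cp = lead & 0x1Fu; extra = 1; }
        else if ((lead >> 4) == 0x0E) { cp = lead & 0x0Fu; extra = 2; }
        else if ((lead >> 3) == 0x1E) { cp = lead & 0x07u; extra = 3; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::size_t k = 1; k <= extra; ++k) {
            const unsigned char cont = static_cast<unsigned char>(in[i + k]);
            if ((cont & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | static_cast<std::uint32_t>(cont & 0x3F);
        }
        i += extra + 1;

        if (cp >= 0x10000) {
            // Code points outside the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        } else {
            out.push_back(static_cast<char16_t>(cp));
        }
    }
    return out;
}
```

For instance, the two bytes 0xC3 0xA9 ("é") come out as the single code unit 0x00E9, while a four-byte emoji comes out as two code units.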
a1-kh
Posts: 33
Joined: Wed Nov 06, 2019 1:55 pm

Post by a1-kh »

I finally cracked this case, and I am very sleepy right now :( , no sleep for two days straight :(

Just to elaborate on the problem: I finally found the correct way to use Unicode characters on Windows through the command line, with full support for reading, writing and processing.

Thanks TAO for the advice, but it was not enough for me; I had to find the correct way to do it on Windows with the command line.
I created a video to explain it with an example, but because the solution is big to implement, I decided to split it across more than one video.

I tried to make it as interesting as I could, hope you guys find it useful :lol: :roll: .
Don't mind my voice, I recorded it while sleepy :? :?
Here is the link:

https://www.youtube.com/watch?v=070k_uocw0M
TAO
Developer
Posts: 850
Joined: Sun Mar 24, 2019 4:49 pm
Location: France

Post by TAO »

That was a different approach, but I don't think you can do it that way everywhere, especially if you consider C++11 or C++14 as the standard.
Also, remember that if you cannot see a character, many other applications, and LuxCore too, may not be able to read it correctly either. It's not just about displaying everything; it's mostly about making it correct, which is why I replace unreadable characters with readable ones.
You can replace "\" with "/" like this:

Code: Select all

// Needs <algorithm> and <string>. Backslashes must be escaped in C++ literals.
std::string str = "c:\\temp\\folder\\folder 2\\";  // dynamic or static path
std::replace(str.begin(), str.end(), '\\', '/');
The result will look like this:

Code: Select all

c:/temp/folder/folder 2/
By the way, that was a nice video tutorial.
a1-kh
Posts: 33
Joined: Wed Nov 06, 2019 1:55 pm

Post by a1-kh »

Thank you, glad you liked it :D :roll: . I tried to make it funny, but I was soooo sleepy. 2 days without sleep, and I slept for 14 hours after it :( :( .
Yes, you are correct, this was a different approach. My goal was to not change the names on my hard drive, because it would be a real pain to change them manually before processing them :? :cry: .
But yes, when someone wants to program for LuxCore and other programs, it would be loads easier to change the names to plain characters. It would make your life much, much easier :!:, and one wouldn't have to go down the rabbit hole that I went through :lol: :idea: .

By the way, I haven't had the chance to look at MaxToLux yet. I will try to see it shortly, but I am pretty sure that you did a great job with it. And the new image of the installer looks awesome :) .
TAO
Developer
Posts: 850
Joined: Sun Mar 24, 2019 4:49 pm
Location: France

Post by TAO »

Nice. The new MaxToLux is much better at handling issues and more stable. If you test it and see any issues, please report them.
I'm working on fixing a few issues that have already been reported by other users.