Add test cases for the chomp option, fix for undef $/, improve some docs #20
On 12/12/18 12:19 PM, Harald Jörg wrote:
> This is a pull request for the CPAN Pull Request Challenge.
>
> The main plan was to add some tests for the chomp option of
> |read_file|, which are included. But as so often, this led to
> other discoveries...
>
> |read_file| chomps the results only if they have been split into
> "lines" according to the current value of |$/|. So I tried some
> possible values.
>
> |read_file| has never worked when |$/| is undefined, where the split
> on an empty value would result in an array with two elements per
> character of input, every other one undefined. The PR changes this
> so that with undefined |$/|, the whole file content will be returned
> as the first and only element of the array. The test cases no.8 and
> no.9 exercise that, and fail in the master branch.
this is not a good case IMO. if you want the whole file in a scalar,
just call read_file in scalar context. there is no need to ever set $/
to undef for that. if you want the whole file in the first element of an
array use the scalar function:
my @array = (scalar read_file( 'foo' ), scalar read_file( 'bar' ) ) ;
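For comparison, here is a minimal sketch of what plain Perl I/O does with an undefined $/ in list context, which is the behavior the PR text refers to. It uses an in-memory filehandle instead of File::Slurp, and the helper name read_records is made up for this illustration:

```perl
use strict;
use warnings;

# In-memory "file" so the sketch needs no files on disk.
my $content = "one\ntwo\nthree\n";

# Hypothetical helper for this sketch: read the in-memory file with
# plain Perl I/O (readline) under a given value of $/.
sub read_records {
    my ($separator) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    open my $fh, '<', \$content or die "cannot open in-memory file: $!";
    my @records = <$fh>;      # list context: one element per record
    close $fh;
    return @records;
}

my @lines = read_records("\n");     # default line-by-line splitting
my @whole = read_records(undef);    # undefined $/: whole content at once

printf "newline: %d records\n",   scalar @lines;    # 3
printf "undef:   %d record(s)\n", scalar @whole;    # 1
```

With the separator undefined, readline returns the whole content as the only element, which is the same result the scalar-context call above gives without touching $/.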
> Processing fixed length records with e.g. |$/ = \1024| doesn't work
> either, as it tries to split on a regex like |SCALAR(0x55c41c65fba0)|.
> Apparently nobody has missed it so far, therefore in the current state
> of the PR I just added to the docs that it isn't supported. If
> desired, I'd work on it - it isn't too difficult.
read_file is to read the whole file. processing in fixed length records
is not a whole file. just use read for that.
uri
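A small sketch of the two points above, using only core Perl: a reference such as \1024 stringifies to SCALAR(0x...) when interpolated, and the builtin read handles fixed-length records directly. The record length of 4 and the sample data are made up for the illustration:

```perl
use strict;
use warnings;

# A reference interpolated into a string (or a regex) stringifies to
# something like SCALAR(0x55c4...), so it is useless as a split pattern.
my $record_ref  = \4;
my $stringified = "$record_ref";
print "stringified: $stringified\n";

# Fixed-length records with the builtin read(), as suggested above.
my $content = 'AAAABBBBCC';
open my $fh, '<', \$content or die "cannot open in-memory file: $!";

my @records;
my $chunk;
while ( read( $fh, $chunk, 4 ) ) {    # returns 0 at end of file
    push @records, $chunk;
}
close $fh;

print join( '|', @records ), "\n";    # AAAA|BBBB|CC
```

The last record is shorter than the requested length because read returns whatever is left at the end of the input.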
|
I agree! Neither of the settings for |$/| provides an interesting use case for |File::Slurp|.
Of course, I'm open to requests for improvement or different decisions by the owners and will adjust the Pull Request as requested.
-- haj
|
On 12/12/18 2:16 PM, Harald Jörg wrote:
> I agree! Neither of the settings for |$/| provides an interesting use
> case for |File::Slurp|. However, it is in these cases where
> |read_file| behaves differently from plain Perl I/O, in a way
> which is impossible to predict without inspecting the code: For
> example, they return arrays which contain undefined elements. I think
> that the behavior should either be made similar to Perl I/O (which I
> did for undefined |$/|), or at least documented (which I did for
> integer references), or read_file should even defend itself against
> these values by throwing an error. I didn't want to throw an error
> because of the long long list of reverse dependencies of this module.
> File::Slurp is one of the modules which can easily be used by Perl
> beginners, who might not yet be aware of the dangers lurking when
> global variables are changed at some place in the code....
but read_file is not supposed to be like perl i/o as it just reads a
whole file. perl i/o can do all sorts of things with setting $/ and
context and one line at a time. you can't compare them. the best i would
say is to document that you should not change $/ and expect perl i/o
behavior. read_file only uses the default $/ to split on lines and
nothing else. change $/ and you are on your own. more doc warnings are
ok by me. changing any behavior with regards to $/ is not ok.
uri
|
On 12/12/18 3:37 PM, Harald Jörg wrote:
> > read_file only uses the default $/ to split on lines and nothing
> > else. change $/ and you are on your own.
>
> Well, at least paragraph mode (|$/ = ''|) has been supported and
> documented for a long time.
> The docs say that /"In list context it will return a list of lines
> (using the current value of $/ as the separator including support for
> paragraph mode when it is set to '')."/
>
> This documentation can, of course, be adjusted to discourage any
> /current value of |$/|/ different from newline and the empty string,
> and I'll remove the handling of undefined |$/| together with the
> corresponding tests.
yep. so adding a warning to not change $/ to anything but a normal line
ending or to '' for paragraph mode would be a good thing.
uri
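For readers unfamiliar with paragraph mode, here is a minimal sketch of what $/ = '' means in plain Perl I/O: records are separated by one or more blank lines, and chomp strips all trailing newlines from each record. It uses an in-memory filehandle and made-up sample text rather than File::Slurp itself:

```perl
use strict;
use warnings;

# Two paragraphs separated by more than one blank line.
my $content = "para one, line 1\npara one, line 2\n\n\npara two\n";

my @paragraphs;
{
    local $/ = '';    # paragraph mode, limited to this block
    open my $fh, '<', \$content or die "cannot open in-memory file: $!";
    @paragraphs = <$fh>;
    close $fh;
    chomp @paragraphs;    # in paragraph mode this removes all trailing newlines
}

printf "%d paragraphs\n", scalar @paragraphs;    # 2
print "last paragraph: $paragraphs[1]\n";        # para two
```

Note that chomp has to run while $/ is still '' to get the paragraph-mode behavior, which is why it sits inside the block.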
|
If the code permits a particular usecase, it should be tested for. If a
particular usecase is wrong, then the code should prohibit it by dying.
Documentation warnings are not enough. Murphy will always contrive the
strangest usecases possible.
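A rough sketch of what "prohibit it by dying" could look like, assuming that only undefined values and references are rejected. The function name assert_supported_separator and the error messages are invented for this illustration and are not part of File::Slurp:

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical guard: accept plain strings (including '' for paragraph
# mode) and die on the two values of $/ discussed in this thread.
sub assert_supported_separator {
    my ($separator) = @_;
    croak 'undefined $/ is not supported in list context'
        unless defined $separator;
    croak 'reference values of $/ (fixed-length records) are not supported'
        if ref $separator;
    return 1;
}

# Try the four values from the discussion: newline, '', undef, \1024.
my @accepted;
for my $candidate ( "\n", '', undef, \1024 ) {
    my $ok = eval { assert_supported_separator($candidate); 1 } ? 1 : 0;
    push @accepted, $ok;
}

print join( ',', @accepted ), "\n";    # 1,1,0,0
```

Whether such a check is acceptable given the module's many reverse dependencies is exactly the open question in this thread; the sketch only shows that the check itself is small.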
|
On 12/12/18 7:27 PM, Karen Etheridge wrote:
> If the code permits a particular usecase, it should be tested for. If a
> particular usecase is wrong, then the code should prohibit it by dying.
> Documentation warnings are not enough. Murphy will always contrive the
> strangest usecases possible.
how can you tell if the user really wants to mung $/ and use read_file?
you can't test or check for every possible idiotic use of code.
sometimes it is better to let the user learn it on their own.
documenting the issue is so you can always say rtfm. is setting $/ to
undef and slurping in a file to a scalar wrong? it should work as $/ is
not looked at there. paragraph mode only makes sense in a list context
anyway. there are too many possible rabbitholes to handle all possible
cases and know what the user really wants.
uri
|
This is a pull request for the CPAN Pull Request Challenge.

The main plan was to add some tests for the chomp option of |read_file|, which are included. But as so often, this led to other discoveries...

|read_file| chomps the results only if they have been split into "lines" according to the current value of |$/|. So I tried some possible values.

|read_file| has never worked when |$/| is undefined, where the split on an empty value would result in an array with two elements per character of input, every other one undefined. The PR changes this so that with undefined |$/|, the whole file content will be returned as the first and only element of the array. The test cases no.8 and no.9 exercise that, and fail in the master branch.

Processing fixed length records with e.g. |$/ = \1024| doesn't work either, as it tries to split on a regex like |SCALAR(0x55c41c65fba0)|. Apparently nobody has missed it so far, therefore in the current state of the PR I just added to the docs that it isn't supported. If desired, I'd work on it - it isn't too difficult.

Finally, I fixed the docs where it states that the options can be given as a flattened hash, which isn't true for all of the writing operations.