J3 / 97-203

Date:        30 July 1997
To:          J3
From:        R. Maine
Subject:     Command line argument argument rehash

A straw vote on the general direction of the command-line
argument MTE was taken at meeting 141.  That vote records 14 in
favor of the model of paper 97-153 ("POSIX" model), 4 in favor of
the model of papers 97-151 et al (everything-at-once model), 0 in
favor of neither model and 1 undecided.  Based on this, I had
expected the next paper to be "along the general line" of paper
97-153, with some fine tuning.

Paper 97-201 seems to completely miss this mark.  There were many
differences between the approaches of papers 151 and 153, not
just one.  Thats the whole reason I wrote paper 153 in the first
place - I felt there to be too many major differences to treat by
debating them one at a time.  I wanted instead to provide
something that I considered to be at least reasonably close so
that there would only be a few relatively small fine points to
debate and fine tune.

As best as I can tell, paper 97-201 adopts one and only one of
the ideas from paper 153, leaving everything else as in paper
151.  In at least one case, it goes even further by adopting an
idea that was advocated in neither paper 151 nor 153, but was
mentioned as a "straw man" in 153 as something that would be
"unreasonable" an explanation for why it made a different choice.
This really does not seem like a "fine tuning" along the general
direction of paper 153.  It looks more like paper 151 with one
token change as a concession to the vote.  I cannot regard this
as an "impartial" attempt to work along the direction of 153.

The introduction and the history sections of paper 97-201 almost
admit this in their categorization of paper 153 and of the vote
taken.  As the author of paper 153, I completely disagree with
its categorization as "the Iterative model".  I never used that
terminology in the paper; nor was that terminology used in the
straw vote or in the minutes.  The description captured in the
minutes of the straw vote ("POSIX" model) more accurately
captures its flavor of being similar to the POSIX binding, and
having lots of differences from paper 151.  (I note that paper
97-201 also incorrectly describes option 3 in the straw vote;
that option was for neither approach, which is not at all the
same as for having no feature at all; but since that option got
no votes, the only relevance is in illustrating how much paper
97-201 misses the point of that vote).

I propose that paper 97-201 just be dropped and that paper 97-153
be used as the basis for further work on this item.  That doesn't
mean that 97-153 is the final word, but only that further work
should pay at least reasonable attention to its direction and fine
tune a few points instead of throwing out almost the whole thing.

Specific points of paper 97-201 that I disagree with follow.  I'm
not going to try to give full arguments on all of these points;
see paper 97-153 for its explanations of the choices it made.  I
thought we already had that vote.  The object here is just to
point out in how many ways paper 97-201 follows the direction of
151 instead of 153.

1. Excessive reference to the command line model.

  Paper 153 specifically stated that it was avoiding the
  prejudicial use of the term "command line", adopting the more
  neutral term "system argument" that is equally meaningful
  interpreted as a command line argument or as some other
  mechanism.  Several e-mail comments from users were
  specifically supportive of this approach.

  Paper 201 (like 151) uses the prejudicial terminology both in
  its intrinsic names and in its description, and then tries to
  "recover" by saying that the system doesn't actually have to do
  it that way.  I think this far inferior (and tricky to word so
  that you don't just contradict yourself - I don't look forward
  to getting that wording right as editor).

2. Definitions of operating system terms.

  Paper 201 (like 151) proposes that the Fortran standard define
  the terms "operating system", "command line", and "command
  tail".  I think this a bad idea.

3. Discussion of delimitters.

  This is perhaps an offshoot of item 1.  Paper 201 (like 151)
  talks a lot about delimitters.  I don't think the word should
  even come up.

4. Operating system dependence versus processor dependence.

  Paper 201 (like 151) proposes that several things be "defined
  by the operating system, or processor dependent if there is no
  operating system definition."  This concept seems bizzare to
  me.  If it is processor dependent, just say so.  I don't think
  a distinction between operating system defined and processor
  dependent belongs in the standard.

5. Multiple character kinds

  Paper 201 (like 151) requires that the compiler support all
  character kinds for the character arguments.  Paper 153
  requires only support for default character kind, which I
  still feel to be the best approach.

6. Multiple integer kinds

  Paper 201 (like 151) requires that the compiler support all
  integer kinds for the integer arguments.  paper 153 requires
  only support for default integer kind, which I still feel to be
  the best approach.  It does not seem plausible that any of
  these values are likely to exceed the range of default integer.

7. Ill-defined character conversion.

  As part of its requirement for supporting multiple character
  kinds, paper 201 says "If the KIND type or character set of the
  target variable cannot represent the assigned value, an error
  occurs."  I have multiple objections to this.  Sorted from the
  most trivial wording problems to the most important conceptual
  issues: 1) The term "target" is inappropriate here.  2) If
  "KIND type or character set" is supposed to mean something
  other than just "type and kind", then I sure don't know what,
  but I probably don't like whatever new concept it is. 3) I
  don't know whether "an error occurs" means a compilation error
  (which is what you get elsewhere with type and kind mismatches),
  a run-time abort, the status variable returns an error code, or
  that the code is illegal so anything can happen.  4) Fortran has
  no concept of conversion between character kinds and this is far
  to obscure a place to be introducing such a major concept.

  And I'm sorry, but "ability to adapt to the likely consequences
  of the ISO/SC22 mandate for internationalization" doesn't cut
  it.  I see no evidence that this is related to any "likely
  consequence."  And even if it is, it will be much better to
  make such changes in an integrated manner instead of one
  intrinsic at a time.  Such a major concept deserves separate
  discussion and treatment as a major feature on its own - not a
  back-door introduction as a minor part of an MTE.

8. Naming conventions from the 60's

  Paper 201 (like 151) speciously prepends many of its argument
  names with the letter "I".  Looks a lot like I used to code in
  the 60s.  It is inconsistent with usage in simillar situations
  elsewhere in the standard.

9. Lack of module packaging

  Paper 151 said little or nothing about namespace pollution.
  Paper 153 suggested packaging these intrinsics in an intrinsic
  module, following the lead of the exception handling TR and the
  ISO_VARYING_STRING module.  Several comments recieved were
  supportive of this concept.  Paper 201 doesn't even mention why
  it was dropped, just ignoring the idea.

9. Usage of "ISO_" in intrinsic names

  In the process of discussing the concept of module packaging,
  paper 153 said "it would be unreasonable to require all new
  intrinsics to have names starting with ISO_".  Paper 201, in
  an apparent effort to disagree with 153 to the maximum extent
  possible, does exactly that, even though paper 151 hadn't.

10. No portable error detection.

  Paper 201 (like 151) provides no portable means of detecting
  whether an error has occurred.  Specifically, it does not
  define a status code value of zero to mean sucess. (Any other
  system-independent value would also do, but zero seems obvious
  and consistent with other areas of the standard)

  When I first noticed this in paper 151 (or perhaps its
  predecessor), I thought it must have been a trivial accidental
  omission.  It was beyond my meager powers of imagination to
  conceive that anyone would do this intentionally and would
  actively object to making it portable.  Subsequent discussion
  revealed the inadequacies of my imagination.  Another of my
  inadequacies is that I cannot come up with the words to convey
  how bad an idea I think this is.

11. System_argument_count as a subroutine instead of a function

  Its true that paper 153 did say that I could see arguments on
  both sides of this question, but it came down on the side of a
  function by giving the "deciding vote" to similarity with Posix
  1003.9.  I still think that's the overall best choice.
  Admitedly the arguments are not overwhelming.  If this were one
  of only a few changes to the proposal in paper 153, I could
  accept it as "refinement" (assuming, of course, that a majority
  agreed that it was).  As part of a paper that ignores almost
  everything in paper 153, it doesn't do so well.

  Paper 201 had no actual discussion of why this was thought to
  be better.  I would expect some such discussion of "refinements."

12. Noun form as subroutine name

  Although plenty of reasonable exceptions exist, I take it as a
  starting guideline that subroutine names should typically be
  verb forms and function names should be noun forms.  The name
  ISO_NCOMMAND_ARGUMENTS as a subroutine name violates this
  principle, and I don't think this is one of the good violations.
  If this is going to be a subroutine, it should have a verb form
  (likely GET_something_or_other).  Of course, I still think it
  better as a function, so maybe that aspect of the name is ok
  after all.

13. Missing underscore in subroutine name.

  I can understand abbreviating NUMBER to N (I do that one a lot
  myself), but my poor English-trained eyes have trouble seeing
  the N as anything but a typo when it is tacked onto the from of
  COMMAND with no delimitter.  Lets see, that's the 5th unrelated
  thing I dislike about this intrinsic name, without even getting
  into functionality: prepending with ISO_, use of the prejudicial
  term COMAND, wrong grammatical form for a subroutine, missing
  underscore, and I have a slight preference for the *_COUNT form
  over the N_* form.  The last is a fairly minor preference.
  My first choice is still something like SYSTEM_ARGUMENT_COUNT,
  but I could go with something more like N_SYSTEM_ARGUMENTS
  (both as functions, of course).

14. N instead of NUMBER as argument of get-system_argument

  I slightly prefer NUMBER instead of N because the other
  arguments are spelled out (LENGTH instead of L or LEN, etc.)
  This will probably be specified positionally most of the time
  anyway.  Debatable, but again, this doesn't debate it - just
  changes it without note.

15. Why isn't VALUE optional in get_system_argument?

  I specifically made VALUE optional.  Its not a big deal, but why
  not.  The optionality probably wouldn't be used a lot, but I can
  at least imagine a way to use it.  Is there some specific reason
  for this change or just part of a general rejection of everything
  from paper 153?

16. Poorly expressed semantics of length in get_system_argument

  After some study, I finally concluded that paper 153 probably
  intends the same semantics as I do for the lenth argument, but
  its expression is sure confusing (enough so that even the
  author deems an explanatory note necessary).  I think my words
  are a lot closer to useable as edits.  Also, I think it
  important to explicitly mention, as my paper did, but 201
  doesn't, what happens to the length argument if the string is
  too long for the value argument.  I think paper 201 intends the
  same thing, but I'd rather it be explicit.

17. Ignored posix semantics for invalid argument number

  My paper explicitly specified that an invalid argument number
  results in an error status code return.  I did this because
  I was told that this is how posix handled such an error.
  Paper 201 takes the approach of not bothering to mention the
  issue at all.  It niether restricts the value of N nor specifies
  what happens if its value is "nonsensical."

18, Included an unparsed_command_line subroutine

  Paper 153 gave arguments for why this capability should be left
  off.  At least a few other people have agreed with these
  arguments (including at least one or two that initially thought
  otherwised but changed their minds after considering the
  reasons I gave).  I'm not hard-over on this one, but paper 201
  fails to give even token response to my reasoning.  In fact, it
  illustrates the point by even explicitly suggesting one of the
  flaws I warned about.

  The analogy that paper 201 makes to date_and_time is flawed and
  illustrates that it fails to understand the fundamental issue.
  With date_and_time, the system either has a clock or it doesn't.
  There are two forms for retrieving the information, but you are
  guaranteed that it is available in either both forms if it is
  available at all (at least that's my reading - anyone disagree?)

  Paper 201 specifically proposes that the system can
  independently choose whether to make command line data available
  via the individual arguments or via the unparsed command line.
  (Anyway, it explicitly says that the processor could choose
  to return individual arguments, but return blanks for the
  unparsed command line; I think it is implied that the converse
  would also be allowed).

  Therefore, because the processor has so much choice, the
  programmer has none.  To write portable code in such a
  situation, the programmer would be *REQUIRED* to write the code
  to handle *BOTH* forms of retrieval.  He wouldn't get to choose
  the most convenient one (as he can with date_and_time).

  And, of course, this difficulty in writing portable code is
  exacerbated by the inability to portably tell whether a
  call succeeded or not (item 10 above).

  If we are going to include this capability, lets give at least
  some attention to portability.  Perhaps by requiring the
  processor to return the information in either both forms or
  neither (as we do in date_and_time).

19. Poor name choice for "unparsed_command_line"

  If the definition in paper 201 is retained, it is clear that the
  so-called unparsed command line might be quite heavily parsed
  indeed, and possibly otherwise manipulated and then stuck back
  together.  For example, the command line

      myprog ~maine/*.dat

  might well end up "unparsed" as

      /home/maine/file1.dat /home/maine/junk.dat /home/maine/stuff.dat

20. Separating command name from command "tail"

  Paper 153 suggested that if a routine to retrieve the whole
  command line were done that it should not attempt to separate
  the command name from the tail.  Paper 201, of course, goes the
  other way and helps illustrate why you don't want to do this.
  Aside from the difficulty of defining the tail in a
  system-independent manner, paper 201 is going to confuse the
  user that skims it.  A name like get_command_line already
  implies the whole command line (it doesn't say get_command_tail
  does it?).  Adding on "unparsed" only serves to emphasize that
  it really is the whole thing, because separating the name from
  the tail would be part of the parsing.

21. Get_program_name unnecessary

  I don't think this intrinsic is necessary at all.  Which is a
  good thing, because it is pretty ill defined.

22. Difference between name and value in get_program name

  I don't understand this difference.  I certainly don't think it
  is well-defined.  It uses all kinds of concepts that Fortran
  doesn't define, and some operating systems might not either.

  I'm not aware that there was necessarily such a thing as a
  "file name of the program that is executing."  I don't care
  to get into a discussion of what such a thing might or might
  not be.

23.  Trivial factual error repeated the 4th time.

  I know that my comments are being ignored when I see things
  like this.  In every version of Craig's paper he keeps referring
  to "the value which would be returned by an
  INQUIRE(FILE=file-name-expr) statement".  This is complete and
  utter nonsense.  I am not talking about a matter of judgement
  here; I am talking a matter of trivial fact.  The INQUIRE
  statement as shown doesn't return anything at all.

24. I was serious about the problem of spelling environment

  I specifically recommended that the name GET_ENVIRONMENT not be
  used because of the difficulty that many people have in
  spelling environment.  It was a serious comment.  I could
  accept this name anyway, but it is nice to see my
  recommendation given its usual careful consideration.

25. Input string length argument

  I gave what I thought were some cogent arguments for using
  an optional logical argument TRIM_LEN to specify whether or
  not trailing blanks were trimmed from the name in my
  get_system_variable routine.

  I see that paper 201 ignored this recommendation and stuck
  closer to the posix form here.  Similarity to posix would at
  least be a plausible argument, I admit, though I thought the
  balance lay on the other side here.  Again, the courtesy of at
  least invoking the word posix or something as a justification
  might have made this more palatable instead of just ignoring
  my recommendation without comment.


Ok, I'm about out of steam.  (Do I hear sighs of relief?)  That's
at least most of it.  So I count about 25 ways in which paper
153 was ignored and one thing from it that was adopted.  I think
it ought to be sufficient to make my point.