J3 / 97-203

Date: 30 July 1997
To: J3
From: R. Maine
Subject: Command line argument argument rehash

A straw vote on the general direction of the command-line argument MTE was taken at meeting 141. That vote records 14 in favor of the model of paper 97-153 ("POSIX" model), 4 in favor of the model of papers 97-151 et al (everything-at-once model), 0 in favor of neither model and 1 undecided.

Based on this, I had expected the next paper to be "along the general line" of paper 97-153, with some fine tuning. Paper 97-201 seems to completely miss this mark.

There were many differences between the approaches of papers 151 and 153, not just one. That's the whole reason I wrote paper 153 in the first place - I felt there to be too many major differences to treat by debating them one at a time. I wanted instead to provide something that I considered to be at least reasonably close so that there would only be a few relatively small fine points to debate and fine tune.

As best as I can tell, paper 97-201 adopts one and only one of the ideas from paper 153, leaving everything else as in paper 151. In at least one case, it goes even further by adopting an idea that was advocated in neither paper 151 nor 153, but was mentioned as a "straw man" in 153 as something that would be "unreasonable", in explanation of why 153 made a different choice.

This really does not seem like a "fine tuning" along the general direction of paper 153. It looks more like paper 151 with one token change as a concession to the vote. I cannot regard this as an "impartial" attempt to work along the direction of 153.

The introduction and the history sections of paper 97-201 almost admit this in their categorization of paper 153 and of the vote taken. As the author of paper 153, I completely disagree with its categorization as "the Iterative model". I never used that terminology in the paper; nor was that terminology used in the straw vote or in the minutes.
The description captured in the minutes of the straw vote ("POSIX" model) more accurately captures its flavor of being similar to the POSIX binding, and having lots of differences from paper 151. (I note that paper 97-201 also incorrectly describes option 3 in the straw vote; that option was for neither approach, which is not at all the same as for having no feature at all; but since that option got no votes, the only relevance is in illustrating how much paper 97-201 misses the point of that vote).

I propose that paper 97-201 just be dropped and that paper 97-153 be used as the basis for further work on this item. That doesn't mean that 97-153 is the final word, but only that further work should pay at least reasonable attention to its direction and fine tune a few points instead of throwing out almost the whole thing.

Specific points of paper 97-201 that I disagree with follow. I'm not going to try to give full arguments on all of these points; see paper 97-153 for its explanations of the choices it made. I thought we already had that vote. The object here is just to point out in how many ways paper 97-201 follows the direction of 151 instead of 153.

1. Excessive reference to the command line model.
Paper 153 specifically stated that it was avoiding the prejudicial use of the term "command line", adopting the more neutral term "system argument" that is equally meaningful interpreted as a command line argument or as some other mechanism. Several e-mail comments from users were specifically supportive of this approach. Paper 201 (like 151) uses the prejudicial terminology both in its intrinsic names and in its description, and then tries to "recover" by saying that the system doesn't actually have to do it that way. I think this far inferior (and tricky to word so that you don't just contradict yourself - I don't look forward to getting that wording right as editor).

2. Definitions of operating system terms.
Paper 201 (like 151) proposes that the Fortran standard define the terms "operating system", "command line", and "command tail". I think this a bad idea.

3. Discussion of delimiters.
This is perhaps an offshoot of item 1. Paper 201 (like 151) talks a lot about delimiters. I don't think the word should even come up.

4. Operating system dependence versus processor dependence.
Paper 201 (like 151) proposes that several things be "defined by the operating system, or processor dependent if there is no operating system definition." This concept seems bizarre to me. If it is processor dependent, just say so. I don't think a distinction between operating system defined and processor dependent belongs in the standard.

5. Multiple character kinds
Paper 201 (like 151) requires that the compiler support all character kinds for the character arguments. Paper 153 requires only support for default character kind, which I still feel to be the best approach.

6. Multiple integer kinds
Paper 201 (like 151) requires that the compiler support all integer kinds for the integer arguments. Paper 153 requires only support for default integer kind, which I still feel to be the best approach. It does not seem plausible that any of these values are likely to exceed the range of default integer.

7. Ill-defined character conversion.
As part of its requirement for supporting multiple character kinds, paper 201 says "If the KIND type or character set of the target variable cannot represent the assigned value, an error occurs." I have multiple objections to this. Sorted from the most trivial wording problems to the most important conceptual issues:

1) The term "target" is inappropriate here.

2) If "KIND type or character set" is supposed to mean something other than just "type and kind", then I sure don't know what, but I probably don't like whatever new concept it is.
3) I don't know whether "an error occurs" means a compilation error (which is what you get elsewhere with type and kind mismatches), a run-time abort, the status variable returns an error code, or that the code is illegal so anything can happen.

4) Fortran has no concept of conversion between character kinds and this is far too obscure a place to be introducing such a major concept. And I'm sorry, but "ability to adapt to the likely consequences of the ISO/SC22 mandate for internationalization" doesn't cut it. I see no evidence that this is related to any "likely consequence." And even if it is, it will be much better to make such changes in an integrated manner instead of one intrinsic at a time. Such a major concept deserves separate discussion and treatment as a major feature on its own - not a back-door introduction as a minor part of an MTE.

8. Naming conventions from the 60's
Paper 201 (like 151) speciously prepends many of its argument names with the letter "I". It looks a lot like the way I used to code in the 60s. It is inconsistent with usage in similar situations elsewhere in the standard.

9. Lack of module packaging
Paper 151 said little or nothing about namespace pollution. Paper 153 suggested packaging these intrinsics in an intrinsic module, following the lead of the exception handling TR and the ISO_VARYING_STRING module. Several comments received were supportive of this concept. Paper 201 doesn't even mention why it was dropped, just ignoring the idea.

9. Usage of "ISO_" in intrinsic names
In the process of discussing the concept of module packaging, paper 153 said "it would be unreasonable to require all new intrinsics to have names starting with ISO_". Paper 201, in an apparent effort to disagree with 153 to the maximum extent possible, does exactly that, even though paper 151 hadn't.

10. No portable error detection.
Paper 201 (like 151) provides no portable means of detecting whether an error has occurred.
Specifically, it does not define a status code value of zero to mean success. (Any other system-independent value would also do, but zero seems obvious and consistent with other areas of the standard.) When I first noticed this in paper 151 (or perhaps its predecessor), I thought it must have been a trivial accidental omission. It was beyond my meager powers of imagination to conceive that anyone would do this intentionally and would actively object to making it portable. Subsequent discussion revealed the inadequacies of my imagination. Another of my inadequacies is that I cannot come up with the words to convey how bad an idea I think this is.

11. System_argument_count as a subroutine instead of a function
It's true that paper 153 did say that I could see arguments on both sides of this question, but it came down on the side of a function by giving the "deciding vote" to similarity with POSIX 1003.9. I still think that's the overall best choice. Admittedly the arguments are not overwhelming. If this were one of only a few changes to the proposal in paper 153, I could accept it as "refinement" (assuming, of course, that a majority agreed that it was). As part of a paper that ignores almost everything in paper 153, it doesn't do so well. Paper 201 had no actual discussion of why this was thought to be better. I would expect some such discussion of "refinements."

12. Noun form as subroutine name
Although plenty of reasonable exceptions exist, I take it as a starting guideline that subroutine names should typically be verb forms and function names should be noun forms. The name ISO_NCOMMAND_ARGUMENTS as a subroutine name violates this principle, and I don't think this is one of the good violations. If this is going to be a subroutine, it should have a verb form (likely GET_something_or_other). Of course, I still think it better as a function, so maybe that aspect of the name is ok after all.

13. Missing underscore in subroutine name.
I can understand abbreviating NUMBER to N (I do that one a lot myself), but my poor English-trained eyes have trouble seeing the N as anything but a typo when it is tacked onto the front of COMMAND with no delimiter. Let's see, that's the 5th unrelated thing I dislike about this intrinsic name, without even getting into functionality: prepending with ISO_, use of the prejudicial term COMMAND, wrong grammatical form for a subroutine, missing underscore, and I have a slight preference for the *_COUNT form over the N_* form. The last is a fairly minor preference. My first choice is still something like SYSTEM_ARGUMENT_COUNT, but I could go with something more like N_SYSTEM_ARGUMENTS (both as functions, of course).

14. N instead of NUMBER as argument of get_system_argument
I slightly prefer NUMBER instead of N because the other arguments are spelled out (LENGTH instead of L or LEN, etc.). This will probably be specified positionally most of the time anyway. Debatable, but again, paper 201 doesn't debate it - just changes it without note.

15. Why isn't VALUE optional in get_system_argument?
I specifically made VALUE optional. It's not a big deal, but why not? The optionality probably wouldn't be used a lot, but I can at least imagine a way to use it. Is there some specific reason for this change or just part of a general rejection of everything from paper 153?

16. Poorly expressed semantics of length in get_system_argument
After some study, I finally concluded that paper 201 probably intends the same semantics as I do for the length argument, but its expression is sure confusing (enough so that even the author deems an explanatory note necessary). I think my words are a lot closer to usable as edits. Also, I think it important to explicitly mention, as my paper did, but 201 doesn't, what happens to the length argument if the string is too long for the value argument. I think paper 201 intends the same thing, but I'd rather it be explicit.

17. Ignored POSIX semantics for invalid argument number
My paper explicitly specified that an invalid argument number results in an error status code return. I did this because I was told that this is how POSIX handled such an error. Paper 201 takes the approach of not bothering to mention the issue at all. It neither restricts the value of N nor specifies what happens if its value is "nonsensical."

18. Included an unparsed_command_line subroutine
Paper 153 gave arguments for why this capability should be left off. At least a few other people have agreed with these arguments (including at least one or two that initially thought otherwise but changed their minds after considering the reasons I gave). I'm not hard-over on this one, but paper 201 fails to give even token response to my reasoning. In fact, it illustrates the point by even explicitly suggesting one of the flaws I warned about.

The analogy that paper 201 makes to date_and_time is flawed and illustrates that it fails to understand the fundamental issue. With date_and_time, the system either has a clock or it doesn't. There are two forms for retrieving the information, but you are guaranteed that it is available in both forms if it is available at all (at least that's my reading - anyone disagree?)

Paper 201 specifically proposes that the system can independently choose whether to make command line data available via the individual arguments or via the unparsed command line. (Anyway, it explicitly says that the processor could choose to return individual arguments, but return blanks for the unparsed command line; I think it is implied that the converse would also be allowed). Therefore, because the processor has so much choice, the programmer has none. To write portable code in such a situation, the programmer would be *REQUIRED* to write the code to handle *BOTH* forms of retrieval. He wouldn't get to choose the most convenient one (as he can with date_and_time).
And, of course, this difficulty in writing portable code is exacerbated by the inability to portably tell whether a call succeeded or not (item 10 above).

If we are going to include this capability, let's give at least some attention to portability. Perhaps by requiring the processor to return the information in either both forms or neither (as we do in date_and_time).

19. Poor name choice for "unparsed_command_line"
If the definition in paper 201 is retained, it is clear that the so-called unparsed command line might be quite heavily parsed indeed, and possibly otherwise manipulated and then stuck back together. For example, the command line

   myprog ~maine/*.dat

might well end up "unparsed" as

   /home/maine/file1.dat /home/maine/junk.dat /home/maine/stuff.dat

20. Separating command name from command "tail"
Paper 153 suggested that if a routine to retrieve the whole command line were done that it should not attempt to separate the command name from the tail. Paper 201, of course, goes the other way and helps illustrate why you don't want to do this. Aside from the difficulty of defining the tail in a system-independent manner, paper 201 is going to confuse the user that skims it. A name like get_command_line already implies the whole command line (it doesn't say get_command_tail, does it?). Adding on "unparsed" only serves to emphasize that it really is the whole thing, because separating the name from the tail would be part of the parsing.

21. Get_program_name unnecessary
I don't think this intrinsic is necessary at all. Which is a good thing, because it is pretty ill defined.

22. Difference between name and value in get_program_name
I don't understand this difference. I certainly don't think it is well-defined. It uses all kinds of concepts that Fortran doesn't define, and some operating systems might not either. I'm not aware that there is necessarily such a thing as a "file name of the program that is executing."
I don't care to get into a discussion of what such a thing might or might not be.

23. Trivial factual error repeated the 4th time.
I know that my comments are being ignored when I see things like this. In every version of Craig's paper he keeps referring to "the value which would be returned by an INQUIRE(FILE=file-name-expr) statement". This is complete and utter nonsense. I am not talking about a matter of judgement here; I am talking about a matter of trivial fact. The INQUIRE statement as shown doesn't return anything at all.

24. I was serious about the problem of spelling environment
I specifically recommended that the name GET_ENVIRONMENT not be used because of the difficulty that many people have in spelling environment. It was a serious comment. I could accept this name anyway, but it is nice to see my recommendation given its usual careful consideration.

25. Input string length argument
I gave what I thought were some cogent arguments for using an optional logical argument TRIM_LEN to specify whether or not trailing blanks were trimmed from the name in my get_system_variable routine. I see that paper 201 ignored this recommendation and stuck closer to the POSIX form here. Similarity to POSIX would at least be a plausible argument, I admit, though I thought the balance lay on the other side here. Again, the courtesy of at least invoking the word POSIX or something as a justification might have made this more palatable instead of just ignoring my recommendation without comment.

OK, I'm about out of steam. (Do I hear sighs of relief?) That's at least most of it. So I count about 25 ways in which paper 153 was ignored and one thing from it that was adopted. I think it ought to be sufficient to make my point.
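
Illustration only: the retrieval semantics argued for in items 10, 15, 16 and 17 can be modelled in a few lines. This is a hypothetical Python model, not a Fortran interface taken from either paper. The function names echo those mentioned above; the explicit `args` parameter, the error code value, the 1-based argument numbering, and the choice to report the full untruncated length are all assumptions made for the sketch.

```python
# Hypothetical model of the argument-retrieval semantics discussed in
# items 10, 15, 16 and 17. Not a Fortran interface from any J3 paper.

SUCCESS = 0           # item 10: zero portably means success
ERR_BAD_NUMBER = 1    # assumed value; any nonzero code signals failure


def system_argument_count(args):
    """Item 11: the count is modelled as a function returning a value."""
    return len(args)


def get_system_argument(args, number, value_len=None):
    """Return (value, length, status) for argument `number`.

    value_len models the declared length of a fixed-size VALUE variable;
    None models omitting the optional VALUE argument (item 15).
    Numbering from 1 is an assumption of this sketch.
    """
    # Item 17: an invalid argument number yields an error status.
    if not 1 <= number <= len(args):
        return ("", 0, ERR_BAD_NUMBER)

    arg = args[number - 1]
    # Item 16 (assumed semantics): length is the full argument length,
    # even when the value itself has to be truncated.
    length = len(arg)

    if value_len is None:
        return (None, length, SUCCESS)

    # Truncate or blank-pad to the fixed length, as character
    # assignment does in Fortran.
    value = arg[:value_len].ljust(value_len)
    return (value, length, SUCCESS)
```

Under these assumptions a caller can test `status == 0` without knowing anything about the processor, and can detect truncation by comparing the returned length with the size of the value buffer. Omitting the value gives one conceivable use of the optional argument: ask for the length first, then retrieve into a buffer of the right size.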