9/14/04 Meeting Minutes

(Taking minutes: Sun)

y  HP      Jim Hamrick (jh)
Y  HP      Jay Rosser (jr)
Y  HP      Fred Worley (fw)
Y  IBM     Fredy Neeser (fn)
n  NetApp  Arkady Kanevksy (ak)
Y  Sun     Matt Pearson (mp)

cascading ascii art attendance diagram.
(if you have more than 1 minus visible, you are not eligible to vote.)

     hp  ibm  netapp  sun
     --  ---  ------  ---
m-3  +   +    -       +
m-2  +   +    -       +
m-1  +   +    -       +
m-0  +   +    -       +

New AI's

AI fn - look into proposal for a clean way to expose different argument lists.
AI All - see if there is any problem with using polymorphic functions to implement the new functionality, or if new function names are going to be required instead.
AI fw - write a more detailed description of interoperability errors, handling, exposure, regarding CM.

Agenda bashing, approve minutes

Minutes to approve: Minutes from 8/31/04 yet to be sent to reflector

Action item review

JR - publish details of proposal for having Consumer send MPA REQ/REP frames explicitly to reflector.

jr sent out an email on 8 sept "support of ULP with MPA req sent in same direction as final streaming mode message". gist of proposal is to push requirement of req/rep processing onto consumer, to make it not particularly easy to support this model (underlying agenda is we don't want to encourage this, but we don't want to forbid it either). call it_socket_convert() with NO_REQ_REP flag: tells impl. to not send startup messages. gives rough equivalent of an rdmac rnic. gives consumer ability to consume them in their own ULP and a description of how to do it.
jr won't go into gory details - see email for more info. one highlevel point is consumer would need finer control of marker settings on the rnic in this case, as he is essentially behaving as an rnic here.
fn so based on the figure you sent, we cannot keep the consumer from doing this - he can turn mpa req/rep off at the convert call, and can basically send his own mpa frames.
but the mpa reply would now really be the final streaming mode message from the consumer perspective?
jr yes.
fn if this is the case the message would be sent with the convert call?
jr yes. in the email text of the proposal i have descriptions I1-3 and T1-3. you have described step T3.
fn so your concern is that the consumer, in order to fill in the rep properly, he needs to know about the rnic's marker preference, for instance.
jr we have surfaced this already. he'd also need the capability to individually control send and receive markers.
fn what does this mean for the itapi implementation? it gets the convert request, then modifies the qp to rts, then...
[ . . . ]
fn the marker preference could be processed from the rep message?
jr the api would need a hint to see that was the consumer's intent.
fn i see the issue of exposing these details in the convert call.
jr we don't want to encourage this, but goal is to not prohibit it. something to think about. we could just add additional flags, and that would meet the letter of this requirement - wouldn't be that hard. but it is not very elegant.
fn what would be the semantics of those flags in the normal case?
jr could make them "special case flags" that you would not normally use. we already have a set of flags, a flag to disable req/rep processing and a flag to disable markers on receive. could just add one more flag, disable tx marker processing.
jr we could make this extra flag only functional if no_req_rep was turned on. kind of a backwards/confusing suggestion, but it would only be in that mode that you'd be able to alter the tx flag. this could be confusing.
fn i see, only in that case does it make sense to do this. but then of course we need a remark that it is only intended for this special case.
jr and point out that turning off req/rep mode is meant for rdmac mode?
mp is this something that needs to be done in the first draft of the api?
fn we could move it out to phase 3.
jr like the sound of that - this was your original concern, jim?
jh don't have a specific ulp i could point to that needs this. could support delaying it if timing is important?
jr stage 1 of phase 2, or defer to stage 2 of phase 2, or delay it a long time to phase 3.
[ general discussion ]
jr so preference is stage 2 of phase 2?
[ general agreement ]
jr haven't touched on caitlin's response to this. her use model is you have mpa req/rep and don't use a last streaming mode message at all. feels there is no such requirement. i may not be interpreting her response correctly, but, if i am, i do not understand her argument. the very premise of it_socket_convert() is you are using a socket in streaming mode and want to use your own ulp before switching. by definition there must be a last streaming mode message.
fn think she was alluding to the it_ep_connect() case and trying to use convert call to emulate our TII.
fw why?
fn one argument is, you might prefer to establish the connection first under tight control of application, allowing you to do some security checks, etc. that you might not be able to do if the itapi impl. does it for you.
jr sure. if security checks are explicitly set on port numbers against DoS, they need to be constrained not to use streaming mode data?
fw do you mean things like enabling IPsec?
fn i think she alluded to firewall rules that one might want to impose and it might be easier to do that if you have control over the connection from the application in the accept case from the passive side, and only then would you go into the convert case.
fw interesting point, how to do in the ep_connect state, we don't really expose the socket descriptor, we don't really provide an interface for that - but then, that's not really the point of it_ep_connect. i wonder if there's an issue there? the point of ep connect is to take advantage of rdma regardless of the wire, and this issue is very specific to ethernet as a wire.
fw wonder if it's even appropriate to handle this in the api or whether the osv should be handling it. it might be easier for the programmer to maintain the lack of need of explicit control over the connection, by allowing the concerns you discuss to be handled by the OSV and not the api. from my perspective, if i'm an application programmer and i'm trying to write a portable app to an rdma interface, i'd sure prefer to use it_ep_connect - if i use socket convert i have to do a lot of work and understand the feature set of my hardware.
fw basic issue is this. we expose the socket attributes with the convert functionality. specific consumers may want this, and the price they need to pay is sending a streaming mode message. if this is a pedal to the metal api is that an ok restriction?
fw if we determine later in phase 3 or tomorrow that there is a really solid use case here, changing it would not require breaking of compatibility - we could add another flag to determine whether there can be a null LSM - no arguments would change.
jr possible change to state diagrams?
fw issue is effects of not pursuing this now. it may be work to change it later, it may be a lot - but there is no risk of breaking backwards compatibility here. we'd be expanding the functionality not restricting it.
fn could add this later on without breaking things for existing model. merely adding a flag is no big deal. bigger problem is maintaining consistency with the state machine. that is a bigger risk and perhaps we should not take that risk now.
fw agree completely.
fn ep state machine is not affected, it's the qp state machine [ could have this backwards ]

JR - add new IA attribute to detailed requirements describing bit that tells Consumer whether or not they have to post a Receive operation before they can successfully complete a connection.
jr added ia attribute to detailed requirements, it-iwarp-extended-qp-state-machine or some such name (paraphrasing) there is some discussion in the ballot.

FN - create additional text for Global Behaviors section to clarify what an asynchronous call is in the IT-API.

[ see above ]

FN - send JR a note summarizing what needs to be changed here (reduce to only 7.0.0.2).

jr did we address this?
fn yes, had some email exchange on this, it is covered.

MP - send email to the reflector describing the modifications/additions to the IT-API that will be required if the existing MM detailed requirements are ratified.

jr looks like fredy covered this
fn just an attempt to show 2 ideas plus a third, new one on how to do this. colleagues felt both ideas discussed last week were ugly - particularly putting new arguments at the end because the order of the arguments would be unnatural. also: problems with type-checking from an ellipsis. led me then to the third idea to provide a workaround. would put a suffix 1 at the end of the old bindings (existing calls) for backwards compatibility. main reason to use extended names is to encourage applications to use new bindings. but application writers could do a search-and-replace for a quick update.
jh issue is you're creating a source code compatibility problem to prevent one in future? the only justification i can see for this is if you can see we'll have other source code compatibility problems to access this functionality. in other words, will old applications have other source code problems that require editing? think this could be the only one.
fn window they create will be wide on iWARP, so all 1.0 applications need to edit to run on them. they need to think about this. if they don't care about iwarp, perhaps they could use #defines to go with version 1 calls. for someone who really wants to update his application to be future proof he better insert the additional arguments in an appropriate way.
jr tough situation.
another thing the 2.0 consumer needs to be aware of is the cm use model and that could be a much more severe code change than this, which is just a global replace. we now require posted sends and posted receives.
fn much more significant than handling these four calls.
jr trying to class who would be hurt by this approach. an existing IB application that wants to work with ITAPI2.0 on IB1.1?
jh you could argue, why would they want to port to 2.0 then?
fn verbs extensions?
jr would imply they are moving to IB1.2. might be literally required to change their hw in that case, big change.
mp my thought is we need to think about other changes to the api besides this. we have three proposals none of which are perfect.
fn first solution to use 2() calls, this tends to create definitive bindings. once applications use the 2 bindings, why would they want to change?
mp i was thinking the 2 would be there permanently.
fw hate to even suggest this - what would be the implications of essentially doing both, changing the name of both functions, so there is a bind1 and a bind2, and provide a set of #defines that map to whichever you prefer, and you could include one header or the other as you like. this would preserve code purity but gives a lot of rope to hang yourself.
mp you do request what version you want when you open an ia? could have different bindings exposed when you request 1.0 or 2.0 - oh, but this is a runtime check not a compile time check. could slow things down in bind, throw in extra instructions.
fw extra instructions not a problem for create calls.
mp right, already a ctx switch there. issue is for link - but link and bind are separate already - need to move to link to support narrow rmrs...
fn if we want to distinguish between bindings we need to do it at compile time.
fw not sure if we could do compile time checks that would avoid the problems listed in fredy's email.
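[ illustration, not discussed in the meeting: a minimal sketch of fw's "bind1 and bind2 plus #defines" idea. every name here (sketch_bind1, sketch_bind2, SKETCH_USE_V1_BINDINGS, the offset argument) is hypothetical and is not taken from the IT-API headers; the minutes do not say what the new arguments are or which binding should be the default. ]

```c
/* Hypothetical window type for the sketch only. */
typedef struct {
    unsigned long addr;
    unsigned long len;
} sketch_window_t;

/* Stand-in for an "old" binding: no extra argument. */
int sketch_bind1(sketch_window_t *w, unsigned long addr, unsigned long len)
{
    w->addr = addr;
    w->len = len;
    return 0;
}

/* Stand-in for a "new" binding: one invented extra argument (offset). */
int sketch_bind2(sketch_window_t *w, unsigned long addr,
                 unsigned long offset, unsigned long len)
{
    w->addr = addr + offset;
    w->len = len;
    return 0;
}

/* Compile-time selection: the unsuffixed name maps to one binding or
 * the other. Defaulting to the new binding is an assumption. */
#ifdef SKETCH_USE_V1_BINDINGS
#define sketch_bind sketch_bind1
#else
#define sketch_bind sketch_bind2
#endif
```

[ with this arrangement the choice costs nothing at run time, but a source file compiled with the constant defined must pass the old argument list, so the two argument lists still cannot be mixed under the unsuffixed name. ]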
fn could have a single header file, set a variable, if you want to use new features don't define this compile time constant, but if you did, you'd get new features.
jh think you want to do it the other way round, but i see. doesn't seem like much of a burden.
jr that covers compile time, how about run time?
jh run time not an issue for bind/link

AI fredy to look into proposal for a clean way to expose different argument lists.
AI All - see if there is any problem with using polymorphic functions to implement the new functionality, or if new function names are going to be required instead.

IT-API CM requirements for RNIC-PI WG

jr have done no work on this. we should do this but i have made no progress.
fn wrote a list of a few items i think we should include. sent it right before meeting but only to jay. can send to full reflector.
fn felt that sending detailed requirements was probably a bit much. a high level discussion, including ladder diagrams, should be a good introduction.
jr and many of these we already have, so it shouldn't be too painful to generate this.
fn can you cover application level issues, which you have focused on?
fn i will generate a draft tomorrow and send it to the reflector.

Taking meeting minutes

jr with mp departing we have two companies that can take minutes. per company, this is difficult. one proposal is to start on a per-person basis.
[ no objection ]
fn jay and i will try to share the cochairing role, and if i am by myself it is not easy to take minutes. if you guys have more resources on this i really appreciate that i don't have to do this every second time.

Outstanding ballots to vote in telecon:

MM
Email from Matt Pearson, subject "Narrow RMR Support: Ballot (resend: amended)", sent 9/2/04 7:47AM PT

iWARP CM
Email from Jay Rosser, subject "iWARP CM Support: Ballot", sent 9/9/04 11:56AM PT.

jr is anyone inclined to change their vote from what they posted to the reflector?
[ silence ]
then both are so approved.
fw quite a milestone, as this represents detailed requirements for all issues in part one of phase 2.

iWARP CM issues

Email thread started by Jay Rosser, subject "updated CM requirements draft", sent 9/7/04 9:34PM PT

jr many of these have been folded into the ballot we already voted on, but i have not carefully looked through this thread to see if any other issues remain to be discussed.
jr one issue is interoperability terminology. we have a common understanding but the issue is do the requirements as written reflect that? conclusion is that the requirements as written can be interpreted so that they do.
fn real issue is to craft careful words in man pages and possible appendix?
fw one issue you raised in mail that i didn't respond to is whether it is appropriate to surface an error in a noninteroperable situation - fredy said it was ok but not required to do so.
fn one example is disabling mpa req/reply and disabling markers on receive.
fw want to be careful here - that would be silly, but it might work. rnics that are capable of doing ietf crc and framing suppression, but also have the capability to disable mpa req/rep - they could do this.
fn what about remote side?
fw [ ]
fw my point is if i am a real customer and i have a real implementation of iser. if they buy my iser and someone else's iser target, and they happened to purchase incompatible remote and local sides - the hardware will always go to iSCSI and never make it to iSER. need to surface an error indicating incompatible hardware so that the sysadmin can determine it is hw incompatibility and not a sw bug.
fn what do you need for this?
fw possible we need a new error type - interoperability error or incompatibility - suspect it is not immediate.
fw if the initiator is an ietf compliant device using itapi, but the device itself does not support the feature we recommend in the requirements (that the impl allow the consumer to defeat req/rep) - we changed the requirement from "a device shall" to "a device should" allow you to defeat req/rep. we want to make sure itapi runs on the shalls and the shoulds. it won't work as well on the shalls but it should work.
fn always thought the itapi could always control these.
jr fred wants to ensure that devices that do req/rep in hardware can be supported by itapi.
fw for rdmac this is easy, since we have to do it in sw. ietf does not allow for suppression of req/rep. so that gives ihv's latitude to implement it in such a way we can't turn it off, but we would like to.
fn hope is that ihv's will use rnic-pi, but there might be an ihv that can't wait.
fw still want to support these devices. want to strongly encourage the feature, and expect set of devices that can't disable this to be small and over time become 0, but, we want to cast a wide net.

Email thread started by Jay Rosser, subject "Support of ULP with MPA REQ sent in same direction as final streaming mode message", sent 9/8/04 5:06PM PT

fw issue here is if both sides send a 0 length message, what happens? will this surface an error? think it will simply hang.
fw there are some cases where we can surface a specific error. rdmac initiator, must send mpa frames - target is ietf without ability to receive mpa frames. another case is initiator cannot defeat req/rep, target is rdmac and cannot fake req/rep. not sure in latter case how to surface an error. also not sure which errors we should surface.

AI fw to write a more detailed description of interoperability errors, handling, exposure.

fn how can there be a device that can't support framing?
fw devices that do not have enough space to reassemble buffers may require streaming, expect the set of non-framing devices will be quite small, but we need to be prepared.
fw going back to discussion of wording and interpretation, do we want to use the same terminology in the man pages as in the detailed requirements?
jr and how/where should we expose interoperability concerns? think we need a separate section, for starters.
jr it is a specific detailed requirement that we discuss interoperability for the TD case.
fw but there are cases in TI that are not interoperable.
jr but TI requires both sides to be ITAPI, we require a ulp here, so that assumes interoperability? markers must be on, framing plus our protocol must be used.
fw need to verify this to make sure we say this.
fn need to make sure that if dapl works in this area with a ulp of their own, that we can interoperate with them.
fw fortunately we have a simple ulp, we adopt the ietf model with a simple addition of when the stream transitions. it is the most straightforward and obvious ulp based on ietf. provided we codify that in a public place and cross our fingers, we should be okay there.
fw if we are allowing the use of markers to be negotiated and the peer is a non itapi but with a compatible protocol, and that non itapi does everything we have mandated except that they do not require the use of markers, then that does allow a device where an ietf peer with an rdmac compliant device and a non itapi peer with an ietf compliant device would not interoperate if that ietf compliant peer is not able to handle framing.
jr that case cannot be resolved.
fw however is it possible to surface an interoperability error using the TI api?
fn not sure i understand markers here. we allow that TI interface negotiates markers. so what can go wrong? can only affect receive direction.
jr but it's being asked to defeat sending markers which not all rdmac devices can support.
fw ietf device says, i cannot receive markers. rdmac device says i cannot disable markers. this is an interoperability error for both TD and TI interfaces. need to expose interoperability error in TI case.
now we are moving to the case of what if the other side isn't us?
jr do we currently recognize this in the detailed requirements? yes, we do.
fw is the requirement only TD or is TI as well? see 1.2.4.2.5.
jr requirement i had in mind 7.0.2.3.2.2.2.
fw crux of the matter is there should be nothing to prevent the TD case from interoperability with non itapi consumers if the hardware supports it and the ulp we require is supported.
jr we do say that interoperability is a TD feature ONLY. and our error requirements are not explicit - we say an error is returned, not what type.
fn how about we go through the two cases you have described in TD and TI cases and see if we can detect and distinguish connection failures?
fn suppose the remote side requests markers off and the host side can't support it, then the remote side sets the reject bit. how can the local side interpret this reject bit?
fw it wouldn't necessarily interpret the reject bit but there is a case where you are an rdmac device and you receive an mpa req from your peer saying i don't want framing on. you can't do that. so you reject that, and you surface an error.
jr is there a defined wire error for that?
fw i can notify my consumer locally regardless of whether i can notify my remote peer. there are other issues where this is not so clean.
jr and then there is the meta issue of do you want to do this?
fw take the AI or support this kind of detection?
fw so to summarize: we are considering surfacing an error if we can detect an interoperability error, we will think about what cases would cause this, we'll think about where in the man pages we should discuss this, and we'll think about the wording we'll use. the wording may not match the wording in the detailed requirements. (particularly the terms for the types of devices and how they may or may not interoperate.)
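[ illustration, not discussed in the meeting: a minimal sketch of the marker case fw described, where one side cannot receive markers and the other cannot stop sending them. the capability fields, the error code and the function name are all hypothetical; the minutes only say that a new interoperability error type may be needed, not what it looks like. ]

```c
#include <stdbool.h>

/* Hypothetical per-device marker capabilities. */
typedef struct {
    bool can_receive_markers;    /* false: "i cannot receive markers" */
    bool can_disable_tx_markers; /* false: "i cannot disable markers" */
} sketch_marker_caps_t;

/* Hypothetical result codes; not IT-API status values. */
typedef enum {
    SKETCH_INTEROP_OK,
    SKETCH_INTEROP_MARKER_MISMATCH
} sketch_interop_t;

/* Checks one direction of traffic, sender to receiver. The pairing
 * fails only when the receiver cannot accept markers and the sender
 * has no way to turn them off. */
sketch_interop_t sketch_check_direction(const sketch_marker_caps_t *sender,
                                        const sketch_marker_caps_t *receiver)
{
    if (!receiver->can_receive_markers && !sender->can_disable_tx_markers)
        return SKETCH_INTEROP_MARKER_MISMATCH;
    return SKETCH_INTEROP_OK;
}
```

[ a full check would run this once per direction. it assumes each side already knows the peer's capabilities, which the discussion above leaves open; fw notes only that the local consumer can be notified regardless of whether the remote peer can be. ]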

Next steps

Focus on man page generation (assuming req'ts voted in), occupy spare time in telecons with errata review, next round of detailed requirements to be prioritized.

fn i will work on narrow bind man pages and send a draft to the reflector.
jr will do the same for connection establishment.

[ meeting adjourned at 3pm EDT ]