Welcome to this EMC Support Community Ask the Expert conversation.
This is an opportunity to learn about and discuss:
What is AIMA?
What is the design philosophy of AIMA, or, in other words, how should I configure it to be successful?
Some common deployment scenarios with best practices and pitfalls.
This discussion begins on 9-16-2013 and concludes on 9-27-2013. Get ready by bookmarking this page or signing up to receive email notifications.
Your host(s): Tim Wright
Occupation: Technical Support Engineer V

Biography: I have a degree in Computer Science with Mathematics from the University of Bristol, England. Subsequently, I have spent over 25 years working on complex clustered systems in Customer Support, Professional Services, QA, and Engineering at various companies. I have performed kernel development on both Unix (DYNIX/ptx) and Linux. Because of my breadth and depth of knowledge and experience, I am frequently called upon to diagnose and solve the most complex and difficult problems that span multiple disparate areas of the product and environment. Outside of work, I enjoy food and wine (both eating out and cooking/entertaining), playing with technology, and music (singing and musical typesetting).

Expertise: Extremely broad and deep knowledge of Unix and Linux, with kernel-level development experience in both. Considerable knowledge and experience in Microsoft Windows architecture and programming. Significant VMware experience, including VSP and VTSP certification. Deep knowledge of IP networking, InfiniBand, storage, SANs, Intel-architecture hardware, etc. Programming languages include C, Python, Perl, shell, and many others. Excellent customer skills in support and also pre-sales.

Company: EMC Isilon Storage Division

Display Name: Tim Wright
This Ask the Expert event is now open for comments and questions. We look forward to an interesting and informative discussion!
I'm glad you asked that. If you can give me about 15 minutes or so, I am about to post an intro which will, I hope, answer that question and possibly others and maybe provoke further questions.
Let me start by doing a very brief write-up on the subject:
Q: What is AIMA?
A: AIMA stands for Authentication, Identity Management, and Authorization. Taken together, it is the functionality in OneFS that talks to the various authentication sources, combines the different personas for a user into an access token, and mediates access to files and directories. Most of the work is performed by the lsassd or lsass process.
Q: Why should I care about this?
A: AIMA is fundamental to correct operation of the cluster. If your auth environment is simple, then configuring AIMA is correspondingly simple. Where it gets interesting is in the area of multiprotocol.
OneFS supports multiprotocol access, i.e., access to the filesystem both from Windows clients that use ACLs and SIDs, and from NFS clients that use POSIX permissions and uids/gids. When correctly configured, it allows users that exist in both "domains" (I'm not referring to AD here) to seamlessly access their data.
AIMA has the concept of auth providers which are directory services where users can be looked up. These include Active Directory, LDAP, NIS as well as cluster-local users. It is probably worth mentioning a few of the design tenets here because they determine how to best configure AIMA.
The underlying reason for the asymmetry of the default mode is to accommodate NFSv3. If you are using NFS without Kerberos, then NFS isn't actually doing any authentication: the protocol simply passes numeric uids and gids in the RPCs on the wire. Given this, if these clients are to have access to data, those uids and gids must somehow grant them that access. Because the default settings ensure that Windows users who also exist in a Unix auth provider store their Unix identities, that requirement is met.
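To make "NFS isn't doing any authentication" concrete, here is a rough sketch (not OneFS code; the helper is hypothetical) of what an AUTH_SYS (AUTH_UNIX) credential body carries on the wire per RFC 5531: a machine name and bare numeric ids, with no password, ticket, or other proof of identity anywhere.

```python
import struct

def pack_auth_sys(machine: str, uid: int, gid: int, gids: list) -> bytes:
    """Pack an RFC 5531 AUTH_SYS credential body (XDR encoding).

    Note there is no secret anywhere in this structure: the server
    simply trusts whatever numeric ids the client chooses to send.
    """
    name = machine.encode()
    pad = (4 - len(name) % 4) % 4                 # XDR pads strings to 4 bytes
    body = struct.pack(">I", 0)                   # stamp (arbitrary)
    body += struct.pack(">I", len(name)) + name + b"\x00" * pad
    body += struct.pack(">II", uid, gid)          # the ids the server will use
    body += struct.pack(">I", min(len(gids), 16)) # at most 16 supplementary gids
    for g in gids[:16]:
        body += struct.pack(">I", g)
    return body

cred = pack_auth_sys("client1", 1000, 1000, [1000, 20])
```

This is why, without Kerberos, the only meaningful access control for NFSv3 clients is whether those uids and gids match the permissions on the files.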
One of the challenges in implementing multiprotocol access is the lack of common identities in the on-the-wire protocols. SMB only understands SIDs and ACLs. Basic NFSv3 only understands POSIX permissions and uids/gids (v4 adds support for its own ACL format). Another function of AIMA is to perform this translation. For POSIX ids that do not have an equivalent Windows user, SIDs from special ranges (S-1-22-1 for Unix users and S-1-22-2 for Unix groups) are returned. For SIDs that do not have an equivalent POSIX identity, AIMA constructs "fake" uids/gids and remembers the mapping. This allows, for example, an NFS user to perform stat() (e.g. ls) on a file owned by a user that only exists in Windows and get a sane answer.
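The two translation directions above can be sketched as a toy model (illustrative only, not OneFS code; the SID prefixes shown are the Samba-style "Unix SID" namespaces, and the starting uid for fake allocations is an arbitrary assumption):

```python
import itertools

# Samba-style Unix SID namespaces, used here purely for illustration;
# the exact ranges a given OneFS release uses are an implementation detail.
UNIX_USER_SID = "S-1-22-1-{uid}"

class IdMapper:
    """Toy model of AIMA's two translation directions."""

    def __init__(self, first_fake_uid=1000000):   # arbitrary assumed base
        self._fake_uids = {}                      # SID -> allocated uid
        self._next = itertools.count(first_fake_uid)

    def uid_to_sid(self, uid: int) -> str:
        # A POSIX id with no real Windows user still needs a SID so
        # that SMB clients and ACLs can refer to it.
        return UNIX_USER_SID.format(uid=uid)

    def sid_to_uid(self, sid: str) -> int:
        # A Windows-only user still needs a uid so that stat() over NFS
        # returns something sane; the mapping is remembered so the same
        # SID always yields the same uid.
        if sid not in self._fake_uids:
            self._fake_uids[sid] = next(self._next)
        return self._fake_uids[sid]

m = IdMapper()
m.uid_to_sid(1000)                          # a SID SMB can present
m.sid_to_uid("S-1-5-21-1111-2222-3333-512") # a stable fake uid for NFS
```

The key property is stability: the same Windows SID must map to the same fake uid every time, or POSIX ownership would appear to change between lookups.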
That's probably enough to start. I will also post some shorter articles focusing on particular tips, tricks and "gotchas".
It seems that lsass(d) in 6.5 never caches queries from an LDAP server. Run stat (on the cluster) on 50 files, and 200 full queries, with one new connection per query, are sent to the LDAP server within a second or so -- even if only a few distinct user/group names appear as file owners.

Sporadically our cluster (and others here have reported the same) complains about the LDAP server accepting no (more) connections. That is quite understandable with such flooding.
How about better caching? Are there options, or am I missing something? Are there improvements in 7.0? Or is id caching regarded as a problem when looking at the whole cluster? That could be the case, but it is not fully obvious: considering that queries to the LDAP server are asynchronous (from multiple nodes) anyway, a cache lifetime of even a few seconds shouldn't do any harm, I think, and would help a lot in the above example.
You are quite correct: the LDAP provider in 6.5.x does not cache lookups. Happily, I can report that in OneFS 7.x, lsass performs caching for both NIS and LDAP, in addition to the caching it has always performed for Active Directory.
By default, the id cache size is 4.77MB and can be modified if required:
# isi zone zones modify <zone> --cache-size=<size>
The cache lifetime for both NIS and for LDAP is 15 minutes by default. The duration can also be modified using:
# isi auth ldap modify <provider> --cache-entry-expiry=<duration>
Your latter point is also important. As implemented today, each node has its own instance of lsass, its own cache, etc. The idmap database for mappings is shared, but there is no globally coherent cache. That means that on a large cluster, each node will be performing its own lookups against the auth providers. Given that a user will generally connect to only one node at a time, and that the lookups are cached, this should not cause a dramatic increase in traffic, but I do plan to talk to Engineering about whether some degree of collaboration between the nodes would be feasible. This would not be trivial to achieve, but it certainly warrants consideration.
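The per-node caching behaviour described above can be sketched as a simple TTL cache (illustrative only; the real lsass cache and its sizing are more involved). The point is that repeated lookups of the same identity within the expiry window cost one trip to the LDAP server, not fifty:

```python
import time

class TtlCache:
    """Minimal per-node lookup cache with entry expiry (a sketch,
    not lsass's actual implementation)."""

    def __init__(self, lookup, ttl_seconds=900):  # 900s = the 15-minute default
        self._lookup = lookup         # the expensive call, e.g. an LDAP query
        self._ttl = ttl_seconds
        self._entries = {}            # key -> (value, expiry time)

    def get(self, key):
        value, expires = self._entries.get(key, (None, 0.0))
        if time.monotonic() < expires:
            return value              # served from cache: no server round trip
        value = self._lookup(key)     # miss or expired: go to the provider
        self._entries[key] = (value, time.monotonic() + self._ttl)
        return value

queries = []
cache = TtlCache(lambda name: queries.append(name) or f"uid-for-{name}")
for _ in range(50):                   # stat 50 files owned by one user...
    cache.get("alice")
# ...and only one query reaches the LDAP server
```

Each node holding an independent instance of such a cache is exactly the "no globally coherent cache" situation described: correct, but with each node paying its own warm-up misses.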
sorry for the lack of clarity there. Yes, by 7.x I mean 7.0.1.x or 7.0.2.x or 7.1.x when it ships.
Current GA versions are, I believe, the latest releases in the 7.0.1 and 7.0.2 maintenance release trains. The caching was added in 7.0 and will exist in all future versions.