问题现象: 在银河麒麟V10上执行ansible时, 遇到了一个报错问题: ansible_pkg_mgr变量没有定义, 遂写了一个test.yml测试一下:
|
- hosts: localhost tasks: - debug: "msg={{ ansible_distribution}}" - debug: "msg={{ ansible_pkg_mgr }}" |
测试ansible_distribution为redhat, 而ansible_pkg_mgr的确是没有取出来 问题分析: 根据报错,很明确是因为ansible无法自动判断出系统使用的yum版本导致,我们知道当ansible中yum模块不指定use_backend参数时,将尝试自动判断,而ansible的setup模块可以获取对应的必要信息, 其中一个变量ansible_pkg_mgr及对应yum后端模块,接下来我们执行setup模块输出ansible_pkg_mgr变量来验证下我们的判断:
|
# ansible -i hosts node01 -m setup -a "filter=ansible_pkg_mgr" node01 | SUCCESS => { "ansible_facts": { "discovered_interpreter_python": "/usr/bin/python" }, "changed": false } |
果然没有办法获取到ansible_pkg_mgr变量,先看下系统版本信息:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
|
(ansible) [root@BIP-YMS-OAS resources]# cat /etc/os-releaseNAME="Kylin Linux Advanced Server" VERSION="V10 (Tercel)" ID="kylin" VERSION_ID="V10" PRETTY_NAME="Kylin Linux Advanced Server V10 (Tercel) ANSI_COLOR="9:31" (ansible) [root@BIP-YMS-QAS resources]# cat /etc/os-release NAME="Kylin Linux Advanced Server" VERSION="V10 (Tercel)" ID="kylin" VERSION_ID="V10" PRETTY_NAME="Kylin Linux Advanced Server V10 (Tercel) ANSI_COLOR="0:31" (ansible) [root@BIP-YMS-OAS resources]# uname -a Linux BIP-YMS-QAS 4,19.90-23.8,v2101.ky10.x86_64 #1 SMP Mon May 17 17:88:34 CT 2021 X86 64 X86 64 X86-64 GNU/Linux |
接下来根据报错提示信息找到ansible相关代码,在yum.py中,相关代码如下: ansible/plugins/action/yum.py
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32
|
if module not in ["yum", "yum4", "dnf"]: facts = self._execute_module(module_name="setup", module_args=dict(filter="ansible_pkg_mgr", gather_subset="!all"), task_vars=task_vars) display.debug("Facts %s" % facts) module = facts.get("ansible_facts", {}).get("ansible_pkg_mgr", "auto") if (not self._task.delegate_to or self._task.delegate_facts) and module != 'auto': result['ansible_facts'] = {'pkg_mgr': module} if module != "auto": if module == "yum4": module = "dnf" if module not in self._shared_loader_obj.module_loader: result.update({'failed': True, 'msg': "Could not find a yum module backend for %s." % module}) else: # run either the yum (yum3) or dnf (yum4) backend module new_module_args = self._task.args.copy() if 'use_backend' in new_module_args: del new_module_args['use_backend'] display.vvvv("Running %s as the backend for the yum action plugin" % module) result.update(self._execute_module(module_name=module, module_args=new_module_args, task_vars=task_vars, wrap_async=self._task.async_val)) # Now fall through to cleanup else: result.update( { 'failed': True, 'msg': ("Could not detect which major revision of yum is in use, which is required to determine module backend.", "You can manually specify use_backend to tell the module whether to use the yum (yum3) or dnf (yum4) backend})"), } ) # Now fall through to cleanup |
如代码所示,当执行yum未指定use_backend参数时,ansible会执行setup模块并根据ansible_pkg_mgr来自动判断yum的版本,获取不到则会报错,继续看下该参数的获取过程,找到pkg_mgr.py,关键代码如下: ansible/module_utils/facts/system/pkg_mgr.py
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
|
def collect(self, module=None, collected_facts=None): facts_dict = {} collected_facts = collected_facts or {} pkg_mgr_name = 'unknown' for pkg in PKG_MGRS: if os.path.exists(pkg['path']): pkg_mgr_name = pkg['name'] # Handle distro family defaults when more than one package manager is # installed or available to the distro, the ansible_fact entry should be # the default package manager officially supported by the distro. if collected_facts['ansible_os_family'] == "RedHat": pkg_mgr_name = self._check_rh_versions(pkg_mgr_name, collected_facts) ... ... def _check_rh_versions(self, pkg_mgr_name, collected_facts): if collected_facts['ansible_distribution'] == 'Fedora': if os.path.exists('/run/ostree-booted'): return "atomic_container" try: if int(collected_facts['ansible_distribution_major_version']) < 23: for yum in [pkg_mgr for pkg_mgr in PKG_MGRS if pkg_mgr['name'] == 'yum']: if os.path.exists(yum['path']): pkg_mgr_name = 'yum' break else: for dnf in [pkg_mgr for pkg_mgr in PKG_MGRS if pkg_mgr['name'] == 'dnf']: if os.path.exists(dnf['path']): pkg_mgr_name = 'dnf' break except ValueError: # If there's some new magical Fedora version in the future, # just default to dnf pkg_mgr_name = 'dnf' elif collected_facts['ansible_distribution'] == 'Amazon': pkg_mgr_name = 'yum' else: # If it's not one of the above and it's Red Hat family of distros, assume # RHEL or a clone. For versions of RHEL < 8 that Ansible supports, the # vendor supported official package manager is 'yum' and in RHEL 8+ # (as far as we know at the time of this writing) it is 'dnf'. # If anyone wants to force a non-official package manager then they # can define a provider to either the package or yum action plugins. if int(collected_facts['ansible_distribution_major_version']) < 8: pkg_mgr_name = 'yum' else: pkg_mgr_name = 'dnf' return pkg_mgr_name |
以上代码可以看到当判断系统为红帽系,则会继续判断系统版本信息,当主版本号<8则使用yum,否则使用dnf,这里我们初步判断为麒麟对系统做了某些修改导致无法获取到主版本号。先执行setup获取发行版代号验证下是否执行了上述逻辑:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
|
# ansible -i hosts node01 -m setup -a "filter=ansible_distribution" node01 | SUCCESS => { "ansible_facts": { "ansible_distribution": "RedHat", "discovered_interpreter_python": "/usr/bin/python" }, "changed": false } # ansible -i hosts node01 -m setup -a "filter=ansible_distribution_major_version" node01 | SUCCESS => { "ansible_facts": { "ansible_distribution_major_version": "V10", "discovered_interpreter_python": "/usr/bin/python" }, "changed": false } |
通过setup模块的输出结果可看到系统是判断为redhat发行版,但是通过ansible_distribution_major_version获取到的发行版主版本号为V10, 而和上面判断yum版本的代码关联起来看就会发现问题所在,int(collected_facts[‘ansible_distribution_major_version’]) < 8 中,ansible_distribution_major_version 变量在其初始化的代码中对应为为distribution_version.split(‘.’)[:2][0]的取值,而当系统中获取到的值是V7Update6时,该显然无法满足转换为int的要求。接下来看下V10这个关键字的定义位置,根据经验系统版本相关信息是在/etc/os-release中, 这里可以看到VERSION_ID的值被定义为V10,而系统原生发行版中该值是7,我们来看下os-release中对VERSION_ID参数的说明:
|
man os-release ... ... VERSION_ID= A lower-case string (mostly numeric, no spaces or other characters outside of 0-9, a-z, ".", "_" and "-") identifying the operating system version, excluding any OS name information or release code name, and suitable for processing by scripts or usage in generated filenames. This field is optional. Example: "VERSION_ID=17" or "VERSION_ID=11.04". ... ... |
根据man文档中的描述,VERSION_ID取值范围为全小写,通常为数值型,不应有空格或其他特殊字符,可包含的字符为0-9a-z._-,那么这里可以看到两个问题, 第一个问题是kylin的VERSION_ID不符合此描述,包含了大写字符,第二个问题是VERSION_ID可以包含a-z字母,但是通常是数值如17,11.04等。 但由于常见发行版都将此处处理为数值型,就导致ansible按照此约定俗成固化了其获取系统版本的方法,并试图将一个字符串转换为int,不能满足当VERSION_ID包含了字母的情况。 验证结论 通过以上判断看到VERSION_ID是导致该问题现象的关键,那么我们可以尝试修改一下该参数值,再执行setup看看是否可以正常工作: 手动将VERSION_ID从V10改为7,测一下: 改为10, 再测一下: 可以看到,修改os-release中VERSION_ID为纯数值后,setup就可以正常判断到系统版本,进而可以获取到正确的yum版本了。 通过以上可以看到操作系统中即便是一些不起眼的细枝末节,处理不当也可能引发”连锁反应”。