I am trying to use the run_as_user feature in Airflow for our DAG, and we are running into some issues. Any help or suggestions?
DAG code:

from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

# Yesterday at the current time, truncated to whole seconds
current_time = datetime.now() - timedelta(days=1)

default_args = {
    'start_date': datetime.strptime(current_time.strftime('%Y-%m-%d %H:%M:%S'), '%Y-%m-%d %H:%M:%S'),
    'run_as_user': 'airflowaduser',  # tasks are launched through sudo as this user
    'execution_timeout': timedelta(minutes=5),
}

dag = DAG(
    'test_run-as_user',
    default_args=default_args,
    description='Run hive Query DAG',
    schedule_interval='0 * * * *',
)

hive_ex = BashOperator(
    task_id='hive-ex',
    bash_command='whoami',  # prints the user the task actually runs as
    dag=dag,
)
I have added the airflow user to sudoers, and it can switch to another user from the Linux shell without a password.
airflow ALL=(ALL) NOPASSWD: ALL
When the DAG runs, I get the following error:
*** Reading local file: /home/airflow/logs/test_run-as_user/hive-ex/2020-06-09T16:00:00+00:00/1.log
[2020-06-09 17:00:04,602] {taskinstance.py:620} INFO - Dependencies all met for <TaskInstance: test_run-as_user.hive-ex 2020-06-09T16:00:00+00:00 [queued]>
[2020-06-09 17:00:04,613] {taskinstance.py:620} INFO - Dependencies all met for <TaskInstance: test_run-as_user.hive-ex 2020-06-09T16:00:00+00:00 [queued]>
[2020-06-09 17:00:04,613] {taskinstance.py:838} INFO - --------------------------------------------------------------------------------
[2020-06-09 17:00:04,613] {taskinstance.py:839} INFO - Starting attempt 1 of 1
[2020-06-09 17:00:04,613] {taskinstance.py:840} INFO - --------------------------------------------------------------------------------
[2020-06-09 17:00:04,651] {taskinstance.py:859} INFO - Executing <Task(BashOperator): hive-ex> on 2020-06-09T16:00:00+00:00
[2020-06-09 17:00:04,651] {base_task_runner.py:133} INFO - Running: ['sudo', '-E', '-H', '-u', 'airflowaduser', 'airflow', 'run', 'test_run-as_user', 'hive-ex', '2020-06-09T16:00:00+00:00', '--job_id', '2314', '--pool', 'default_pool', '--raw', '-sd', 'DAGS_FOLDER/test_run-as_user/testscript.py', '--cfg_path', '/tmp/tmpbinlgw54']
[2020-06-09 17:00:04,664] {base_task_runner.py:115} INFO - Job 2314: Subtask hive-ex sudo: airflow: command not found
[2020-06-09 17:00:09,576] {logging_mixin.py:95} INFO - [2020-06-09 17:00:09,575] {local_task_job.py:105} INFO - Task exited with return code 1
Our Airflow installation runs in a virtual environment.
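When Airflow runs in a virtual environment, only the airflow user is configured to run the airflow command. If you want tasks to run as another user, you need to set that user's home directory to the same one as the airflow user (/home/airflow) and make the user a member of group 0. See https://airflow.apache.org/docs/docker-stack/entrypoint.html#allowing-arbitrary-user-to-run-the-container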
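In addition, the run_as_user feature invokes sudo, which only searches the directories in its secure path. The location of the airflow command is not part of the secure path, but it can be added in the sudoers file. You can use whereis airflow to find where the airflow executable is installed; in my container it is /home/airflow/.local/bin.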
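As a rough way to check this before editing sudoers, the Python sketch below tests whether the directory holding the airflow executable appears in a secure_path string. The helper names and the example secure_path value are illustrative assumptions, not taken from the question; copy the real value from the Defaults secure_path line of your own sudoers file.

```python
import os
import shutil


def dir_in_secure_path(executable_dir, secure_path):
    """Return True if executable_dir is one of the colon-separated secure_path entries."""
    target = os.path.normpath(executable_dir)
    entries = [os.path.normpath(p) for p in secure_path.split(":") if p]
    return target in entries


def find_executable_dir(name="airflow"):
    """Return the directory containing `name` on the current PATH, or None if not found."""
    location = shutil.which(name)
    return os.path.dirname(location) if location else None


# Assumed example value; replace it with the secure_path from your sudoers file.
EXAMPLE_SECURE_PATH = "/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"

if __name__ == "__main__":
    # Fall back to the directory reported in the answer if airflow is not on PATH here.
    airflow_dir = find_executable_dir("airflow") or "/home/airflow/.local/bin"
    if dir_in_secure_path(airflow_dir, EXAMPLE_SECURE_PATH):
        print(f"{airflow_dir} is in secure_path; sudo should be able to find airflow")
    else:
        print(f"{airflow_dir} is NOT in secure_path; add it to secure_path in sudoers")
```

If the directory is missing, sudo cannot resolve airflow even though the airflow user's own shell can, which matches the "sudo: airflow: command not found" line in the task log.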
To fix this, I needed to add 4 lines to my Dockerfile:
RUN useradd -u [airflowaduser UID] -g 0 -d /home/airflow kettle && \ # create airflowaduser