Passing JSON from a PowerShell script to Python

I'm trying to pass a JSON string to a Python script from a PowerShell script (.ps1) in order to automate this task, but without success.

spark-submit `
--driver-memory 8g `
--master local[*] `
--conf spark.driver.bindAddress=127.0.0.1 `
--packages mysql:mysql-connector-java:6.0.6,org.elasticsearch:elasticsearch-spark-20_2.11:7.0.0 `
--py-files build/dependencies.zip build/main.py `
$param

With $param='{ \"\"job_start\"\": \"\"jdbc:mysql://127.0.0.1:3307/test\"\"}' it works fine: Python receives a valid JSON string and parses it correctly.
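The receiving script is not shown in the question. As a rough sketch only, assuming the JSON arrives as the first application argument after the script path, the Python side might look like this. The function name parse_job_config is hypothetical and is not taken from build/main.py:

```python
import json
import sys


def parse_job_config(argv):
    """Parse the first command-line argument as a JSON object.

    Hypothetical helper: assumes argv[1] holds the JSON text.
    """
    if len(argv) < 2:
        raise SystemExit("usage: main.py '<json>'")
    raw = argv[1]
    # Printing the raw argument first shows exactly what survived the
    # PowerShell -> batch file -> Python hand-off, before any parsing.
    print(f"raw argument: {raw!r}")
    return json.loads(raw)


if __name__ == "__main__" and len(sys.argv) > 1:
    config = parse_job_config(sys.argv)
    print(config["job_start"])
```

Printing the raw argument before calling json.loads makes it easier to tell a quoting problem (the text arrives mangled) from a JSON problem (the text arrives intact but is invalid).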

When I use the & character, as in $param='{ \"\"job_start\"\": \"\"jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC&autoReconnect=true&useSSL=false\"\"}', the string is printed as { "job_start": \jdbc:mysql://127.0.0.1:3307/test? and the rest of the string is interpreted as separate commands:

'serverTimezone' is not recognized as an internal or external command
'autoReconnect' is not recognized as an internal or external command
'useSSL' is not recognized as an internal or external command

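The \"\" is there to preserve the double quotes in the Python script; I don't know why two escaped double quotes are needed.

Now I'm running into a problem with the ! character: I can't escape it, even with ^ or \.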

# Only "" doesn't work
$param='{\"\"job_start\"\": \"\"jdbc:mysql://127.0.0.1:3307/test^&serverTimezone=UTC\"\", \"\"password\"\": \"\"testpassword^!123\"\"}'
spark-submit.cmd `
--driver-memory 8g `
--master local[*] `
--conf spark.driver.bindAddress=127.0.0.1 `
--packages mysql:mysql-connector-java:6.0.6,org.elasticsearch:elasticsearch-spark-20_2.11:7.0.0 `
--py-files build/dependencies.zip build/main.py `
$param
# OUTPUT: misses the ! character
{"job_start": "jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC", "password": "testpassword123"}

Thanks, everyone.

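3 comments
I wonder whether spark-submit behaves differently from invoking a plain Python script?
It would be good to find out whether there really is a problem with the way spark-submit passes arguments to Python, or whether the problem is specific to your scenario/environment. Also, in your update you mention output: who produces that output?
@mklement0 Yes, I'm trying to figure out what's going on with `spark-submit`, because passing the argument directly to Python works just as you explained. As for the output, there is a print call at the start of my Spark script.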
python
powershell
cmd
Bruno Bernardes
Posted on 2020-05-06
2 answers
mklement0
Posted on 2020-05-06
Accepted
0 upvotes

tl;dr

Note: The following does not solve the OP's specific problem (whose cause is still unknown), but hopefully contains information of general interest.

# Use "" to escape " and - in case of delayed expansion - ^! to escape !
$param = '{ ""job_start"": ""jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC&more^!"" }'
  • There are high-profile utilities (CLIs) such as az (Azure) that are Python-based, but on Windows use an auxiliary batch file as the executable, which simply relays arguments to a Python script.
  • Use Get-Command az, for instance, to discover an executable's full file name; batch files, which are processed by cmd.exe, the legacy command processor, have a filename extension of either .cmd or .bat.
  • To prevent calls to such a batch file from breaking, double quotes embedded in arguments passed from PowerShell must be escaped as "".
  • Additionally, but only if setlocal enabledelayedexpansion is in effect in a given target batch file or if your computer is configured to use delayed expansion by default, for all batch files:
    • ! characters must be escaped as ^!, which, however, is only effective if cmd.exe considers the ! part of a double-quoted string.
  • In an ideal world, passing JSON text such as '{ "foo": "bar" }' to an external program would work as-is, but due to PowerShell's broken handling of embedded double quotes, that is not enough, and the " chars. must additionally be escaped, for the target program, either as \" (which most programs support), or, in the case of cmd.exe (see below), as "", which Python fortunately recognizes too: '{ ""foo"": ""bar"" }'
  • Limitations of argument passing and escaping in batch files:

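  • It sounds like spark-submit is an auxiliary batch file (.cmd or .bat) that passes arguments through to a Python script.

  • The problem is that if you use \" to escape embedded " characters, cmd.exe does not recognize them as escaped, which causes it to consider the & characters unquoted, so that they are interpreted as shell metacharacters, i.e. as characters with a special syntactic function (in this case, command sequencing).

  • Additionally, and only if setlocal enabledelayedexpansion is in effect in a given batch file, any ! characters in arguments require additional handling:

    • If cmd.exe considers the ! part of an unquoted argument, you cannot escape the ! at all.

    • Inside a quoted argument (which in cmd.exe invariably means "..."), you must escape a literal ! as ^!

    • Note that this requirement is the inverse of how all other metacharacters must be escaped (which require ^ when unquoted, but not inside "...").

  • The unfortunate consequence is that you need to know the implementation details of the target batch file, namely whether or not it uses setlocal enabledelayedexpansion, in order to formulate your arguments correctly.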

  • The same applies if your computer is configured to use delayed expansion by default, for all batch files (and interactively), which is neither common nor advisable. To test if a given computer is configured that way, check the output from the following command for DelayedExpansion : 1: if there's no output at all, delayed expansion is OFF; if there are 1 or 2 outputs, delayed expansion is ON by default if the first or only output reports DelayedExpansion : 1.

  • Since you're technically calling a batch file, use "" to escape literal " chars. inside your single-quoted ('...') PowerShell string.

  • If you know that the target batch file uses setlocal enabledelayedexpansion, or if your computer is configured to use delayed expansion by default, escape ! characters as ^!

    • Note that this is only effective if cmd.exe considers the ! part of a double-quoted string.
  • Therefore (note that I've extended the URL to include a token with !, meant to be passed through literally as suffix more!):

    $param = '{ ""job_start"": ""jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC&more^!"" }'
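To make the quoting argument more concrete, here is a deliberately simplified Python model of the behavior described above: every " toggles the quoted state (since cmd.exe does not treat \" as an escape), and an & only acts as a metacharacter while outside quotes. The function name unquoted_ampersands is hypothetical, and the model ignores ^ escaping and every other cmd.exe parsing rule. The sample command lines assume that PowerShell wraps the whole argument in one outer pair of double quotes because it contains spaces.

```python
def unquoted_ampersands(command_line):
    """Return the positions of & characters that fall outside "..." runs.

    Simplified model (an assumption, not cmd.exe's real parser): each
    double quote flips the quoted state, and a backslash has no special
    meaning.
    """
    in_quotes = False
    positions = []
    for index, char in enumerate(command_line):
        if char == '"':
            in_quotes = not in_quotes
        elif char == "&" and not in_quotes:
            positions.append(index)
    return positions


# \" escaping: each embedded quote flips the state, leaving the URL unquoted.
backslash_style = r'"{ \"job_start\": \"jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC\" }"'
# "" escaping: each pair flips the state twice, so the URL stays quoted.
doubled_style = '"{ ""job_start"": ""jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC"" }"'

print(len(unquoted_ampersands(backslash_style)))  # 1
print(len(unquoted_ampersands(doubled_style)))    # 0
```

Under this model, the \"\" form used in the question also keeps the & quoted, because two quotes in a row flip the state twice.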
    

    If you need to escape an existing JSON string programmatically:

    # Unescaped JSON string, which in an ideal world you'd be able
    # to pass as-is.
    $param = '{ "job_start": "jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC&more!" }'
    # Escape the " chars.
    $param = $param -replace '"', '""'
    # If needed, also escape the ! chars.
    $param = $param -replace '!', '^!'
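The same two replacements can be written as a small Python function, for instance to unit-test the escaping rules or to generate the argument text from a Python-based build step. This is only a sketch: the function name escape_for_batch and the delayed_expansion flag are hypothetical, and the caller still has to know whether the target batch file uses delayed expansion.

```python
def escape_for_batch(json_text, delayed_expansion=False):
    """Escape JSON text for an argument that is relayed through a batch file.

    Hypothetical helper mirroring the two -replace operations:
      1. double every " so cmd.exe keeps the argument quoted;
      2. optionally turn ! into ^! for batch files that run with
         delayed expansion enabled.
    """
    escaped = json_text.replace('"', '""')
    if delayed_expansion:
        escaped = escaped.replace("!", "^!")
    return escaped


original = '{ "job_start": "jdbc:mysql://127.0.0.1:3307/test&serverTimezone=UTC&more!" }'
print(escape_for_batch(original))
print(escape_for_batch(original, delayed_expansion=True))
```

The ! replacement is opt-in because, as noted above, ^! is only appropriate when delayed expansion is actually in effect.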
    

    Ultimately, both problems should be fixed at the source, but that is highly unlikely, because it would break backward compatibility.

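    With respect to PowerShell, this GitHub issue contains the backstory, technical details, a robust wrapper function that hides the problems, and a discussion of how to fix the problem at least on an opt-in basis.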