ZGC_INDEX / 重点类信息提取 (Key Category Information Extraction)
Commit ac0f7adb, authored Apr 06, 2021 by Jialin
Message: Almost no changes
Parent: 8b34e230
Showing 6 changed files with 9 additions and 9 deletions
Changed files (+9 -9):
公共代码/产品品牌分析.py  +2 -2
公共代码/产品类别分析.py  +6 -6
公共代码/产品重复型号分析.py  +1 -1
公共代码/激光打印机产品品牌分析.xlsx  +0 -0
公共代码/激光打印机产品类别分析.xlsx  +0 -0
公共代码/激光打印机产品重复型号分析.xlsx  +0 -0
公共代码/产品品牌分析.py
@@ -8,7 +8,7 @@ import re
 import xlsxwriter
-def brand_washing(filepath,thre=0.5,inner_thre=0.5,a=1,sheet_name=0):
+def brand_washing(filepath,thre=0.5,inner_thre=0.8,a=1,sheet_name=0):
     # filepath: path to the input file; thre: threshold on the model overlap rate between two brands;
     # inner_thre: threshold on the keyword overlap rate within a single model entry under two brands;
     # a: weight adjustment; sheet_name: name (or index) of the sheet to read
     df = pd.read_excel(filepath, sheet_name=sheet_name, converters={'产品编码': str})
     # Handle missing values
@@ -244,6 +244,6 @@ def brand_washing(filepath,thre=0.5,inner_thre=0.5,a=1,sheet_name=0):
     workbook.close()
 if __name__ == '__main__':
-    filepath = 'E:\\ZDZC\\扫描仪参数确认.xlsx'
+    filepath = 'E:\\ZDZC\\激光打印机参数确认.xlsx'
     brand_washing(filepath)
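The body of `brand_washing` is not part of this diff, so how the overlap rates are computed is not visible here. As a rough illustration of what raising the default `inner_thre` from 0.5 to 0.8 could mean, the sketch below assumes the rate is the shared fraction of the smaller keyword set and that a pair matches when the rate is at least the threshold. The helpers `overlap_rate` and `is_match` and the keyword sets are hypothetical and do not come from the repository.

```python
def overlap_rate(left, right):
    """Fraction of the smaller keyword set that also appears in the other set.

    Assumption: the real script may define the rate differently.
    """
    left, right = set(left), set(right)
    if not left or not right:
        return 0.0
    return len(left & right) / min(len(left), len(right))


def is_match(left, right, threshold):
    """Treat two keyword sets as the same item when the rate reaches the threshold."""
    return overlap_rate(left, right) >= threshold


# Hypothetical keyword sets for two model entries (not real data).
model_a = {"brandx", "laser", "m100"}
model_b = {"brandx", "laser", "m200"}

rate = overlap_rate(model_a, model_b)          # two of three keywords are shared
old_result = is_match(model_a, model_b, 0.5)   # old default inner_thre
new_result = is_match(model_a, model_b, 0.8)   # new default inner_thre
print(rate, old_result, new_result)
```

Under these assumptions, a pair that shares two of three keywords counts as a match with the old default but not with the new one, so the stricter threshold would merge fewer entries.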
公共代码/产品类别分析.py
@@ -227,10 +227,10 @@ def class_washing(category, filepath, c_list,a=0.02, b=0.01):
 if __name__ == '__main__':
-    # category='激光打印机'
-    # filepath="E:\\ZDZC\\激光打印机参数确认.xlsx"
-    # c_list=[6,7,-4,-3]
-    category = '扫描仪'
-    filepath = "E:\\ZDZC\\扫描仪参数确认.xlsx"
-    c_list = [7, 8, 9]
+    category = '激光打印机'
+    filepath = "E:\\ZDZC\\激光打印机参数确认.xlsx"
+    c_list = [6, 7, -4, -3]
+    # category = '扫描仪'
+    # filepath="E:\\ZDZC\\扫描仪参数确认.xlsx"
+    # c_list=[7,8,9]
     class_washing(category, filepath, c_list)
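This hunk switches the run from the scanner (扫描仪) inputs to the laser printer (激光打印机) inputs by commenting one block out and the other back in. A minimal sketch of the same selection done through a lookup table follows. `CATEGORY_SETTINGS` and `settings_for` are hypothetical names that do not exist in the repository; the paths and `c_list` values are the ones visible in the diff.

```python
# Hypothetical lookup table; the values are copied from the diff.
CATEGORY_SETTINGS = {
    '激光打印机': {
        'filepath': "E:\\ZDZC\\激光打印机参数确认.xlsx",
        'c_list': [6, 7, -4, -3],
    },
    '扫描仪': {
        'filepath': "E:\\ZDZC\\扫描仪参数确认.xlsx",
        'c_list': [7, 8, 9],
    },
}


def settings_for(category):
    """Return (category, filepath, c_list) in the argument order class_washing takes."""
    try:
        entry = CATEGORY_SETTINGS[category]
    except KeyError:
        raise ValueError(f"no settings for category {category!r}") from None
    # Copy the list so callers cannot mutate the shared table.
    return category, entry['filepath'], list(entry['c_list'])


args = settings_for('激光打印机')
print(args)
# class_washing(*args)  # left commented out: class_washing lives in 产品类别分析.py
```

With a table like this, switching categories would change one string instead of six lines, which keeps diffs such as this one smaller.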
公共代码/产品重复型号分析.py
@@ -149,5 +149,5 @@ def product_washing(filepath, thre=1, a=0):
 if __name__ == '__main__':
-    filepath = "E:\\ZDZC\\扫描仪参数确认.xlsx"
+    filepath = "E:\\ZDZC\\激光打印机参数确认.xlsx"
     product_washing(filepath)
公共代码/brand_filter.xlsx → 公共代码/激光打印机产品品牌分析.xlsx
No preview for this file type
公共代码/激光打印机产品类别分析.xlsx
0 → 100644
File added
公共代码/product_filter.xlsx → 公共代码/激光打印机产品重复型号分析.xlsx
No preview for this file type