pkgsrc-Changes archive
CVS commit: pkgsrc/textproc/sentencepiece
Module Name: pkgsrc
Committed By: wiz
Date: Mon Mar 13 14:17:12 UTC 2023
Added Files:
pkgsrc/textproc/sentencepiece: DESCR Makefile Makefile.common PLIST
buildlink3.mk distinfo
Log Message:
textproc/sentencepiece: import sentencepiece-0.1.97
SentencePiece is an unsupervised text tokenizer and detokenizer
mainly for Neural Network-based text generation systems where the
vocabulary size is predetermined prior to the neural model training.
SentencePiece implements subword units (e.g., byte-pair-encoding
(BPE)) and unigram language model with the extension of direct
training from raw sentences. SentencePiece allows us to make a
purely end-to-end system that does not depend on language-specific
pre/postprocessing.
To generate a diff of this commit:
cvs rdiff -u -r0 -r1.1 pkgsrc/textproc/sentencepiece/DESCR \
pkgsrc/textproc/sentencepiece/Makefile \
pkgsrc/textproc/sentencepiece/Makefile.common \
pkgsrc/textproc/sentencepiece/PLIST \
pkgsrc/textproc/sentencepiece/buildlink3.mk \
pkgsrc/textproc/sentencepiece/distinfo
Please note that diffs are not public domain; they are subject to the
copyright notices on the relevant files.
Added files:
Index: pkgsrc/textproc/sentencepiece/DESCR
diff -u /dev/null pkgsrc/textproc/sentencepiece/DESCR:1.1
--- /dev/null Mon Mar 13 14:17:12 2023
+++ pkgsrc/textproc/sentencepiece/DESCR Mon Mar 13 14:17:12 2023
@@ -0,0 +1,8 @@
+SentencePiece is an unsupervised text tokenizer and detokenizer
+mainly for Neural Network-based text generation systems where the
+vocabulary size is predetermined prior to the neural model training.
+SentencePiece implements subword units (e.g., byte-pair-encoding
+(BPE)) and unigram language model with the extension of direct
+training from raw sentences. SentencePiece allows us to make a
+purely end-to-end system that does not depend on language-specific
+pre/postprocessing.
Index: pkgsrc/textproc/sentencepiece/Makefile
diff -u /dev/null pkgsrc/textproc/sentencepiece/Makefile:1.1
--- /dev/null Mon Mar 13 14:17:12 2023
+++ pkgsrc/textproc/sentencepiece/Makefile Mon Mar 13 14:17:12 2023
@@ -0,0 +1,8 @@
+# $NetBSD: Makefile,v 1.1 2023/03/13 14:17:12 wiz Exp $
+
+PKGCONFIG_OVERRIDE+= sentencepiece.pc.in
+
+.include "Makefile.common"
+
+.include "../../devel/cmake/build.mk"
+.include "../../mk/bsd.pkg.mk"
Index: pkgsrc/textproc/sentencepiece/Makefile.common
diff -u /dev/null pkgsrc/textproc/sentencepiece/Makefile.common:1.1
--- /dev/null Mon Mar 13 14:17:12 2023
+++ pkgsrc/textproc/sentencepiece/Makefile.common Mon Mar 13 14:17:12 2023
@@ -0,0 +1,16 @@
+# $NetBSD: Makefile.common,v 1.1 2023/03/13 14:17:12 wiz Exp $
+#
+# used by textproc/sentencepiece/Makefile
+# used by textproc/py-sentencepiece/Makefile
+
+DISTNAME= sentencepiece-0.1.97
+CATEGORIES= textproc
+MASTER_SITES= ${MASTER_SITE_GITHUB:=google/}
+GITHUB_TAG= v${PKGVERSION_NOREV}
+
+MAINTAINER= pkgsrc-users%NetBSD.org@localhost
+HOMEPAGE= https://github.com/google/sentencepiece/
+COMMENT= Unsupervised text tokenizer for Neural Network-based text generation
+LICENSE= apache-2.0
+
+USE_LANGUAGES= c c++17
Index: pkgsrc/textproc/sentencepiece/PLIST
diff -u /dev/null pkgsrc/textproc/sentencepiece/PLIST:1.1
--- /dev/null Mon Mar 13 14:17:12 2023
+++ pkgsrc/textproc/sentencepiece/PLIST Mon Mar 13 14:17:12 2023
@@ -0,0 +1,17 @@
+@comment $NetBSD: PLIST,v 1.1 2023/03/13 14:17:12 wiz Exp $
+bin/spm_decode
+bin/spm_encode
+bin/spm_export_vocab
+bin/spm_normalize
+bin/spm_train
+include/sentencepiece_processor.h
+include/sentencepiece_trainer.h
+lib/libsentencepiece.a
+lib/libsentencepiece.so
+lib/libsentencepiece.so.0
+lib/libsentencepiece.so.0.0.0
+lib/libsentencepiece_train.a
+lib/libsentencepiece_train.so
+lib/libsentencepiece_train.so.0
+lib/libsentencepiece_train.so.0.0.0
+lib/pkgconfig/sentencepiece.pc
Index: pkgsrc/textproc/sentencepiece/buildlink3.mk
diff -u /dev/null pkgsrc/textproc/sentencepiece/buildlink3.mk:1.1
--- /dev/null Mon Mar 13 14:17:12 2023
+++ pkgsrc/textproc/sentencepiece/buildlink3.mk Mon Mar 13 14:17:12 2023
@@ -0,0 +1,12 @@
+# $NetBSD: buildlink3.mk,v 1.1 2023/03/13 14:17:12 wiz Exp $
+
+BUILDLINK_TREE+= sentencepiece
+
+.if !defined(SENTENCEPIECE_BUILDLINK3_MK)
+SENTENCEPIECE_BUILDLINK3_MK:=
+
+BUILDLINK_API_DEPENDS.sentencepiece+= sentencepiece>=0.1.97
+BUILDLINK_PKGSRCDIR.sentencepiece?= ../../textproc/sentencepiece
+.endif # SENTENCEPIECE_BUILDLINK3_MK
+
+BUILDLINK_TREE+= -sentencepiece
Index: pkgsrc/textproc/sentencepiece/distinfo
diff -u /dev/null pkgsrc/textproc/sentencepiece/distinfo:1.1
--- /dev/null Mon Mar 13 14:17:12 2023
+++ pkgsrc/textproc/sentencepiece/distinfo Mon Mar 13 14:17:12 2023
@@ -0,0 +1,5 @@
+$NetBSD: distinfo,v 1.1 2023/03/13 14:17:12 wiz Exp $
+
+BLAKE2s (sentencepiece-0.1.97.tar.gz) = 969788b6d87e8c992f6df4349f984fb2d6e80f978d4007127174222ec7fcb3ab
+SHA512 (sentencepiece-0.1.97.tar.gz) = 4c35488e3661e45be677b04299c0d0b1f0d46421098f0b1625a1bb5e7725d175dfd55328a5a7bbf88badeb03c2ba087aef942b0d7520a29f6bf34eae211a99eb
+Size (sentencepiece-0.1.97.tar.gz) = 11945436 bytes
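The distinfo entries above record a BLAKE2s digest, a SHA512 digest, and a byte size for the distfile. As an illustration only, a minimal Python sketch of checking a downloaded tarball against such entries could look like the following. pkgsrc performs this check itself during the build; the helper names `parse_distinfo` and `verify_distfile` are hypothetical and not part of pkgsrc. The sketch assumes the distinfo text is passed without the leading `+` diff markers.

```python
import hashlib
import os
import re

# Matches distinfo lines such as
#   "SHA512 (file.tar.gz) = <hex>"  or  "Size (file.tar.gz) = 123 bytes".
# The "$NetBSD: ... $" header line does not match and is skipped.
DISTINFO_LINE = re.compile(r"^(\w+) \((.+)\) = (\S+)")


def parse_distinfo(text):
    """Return {filename: {field: value}} for the checksum and size lines."""
    entries = {}
    for line in text.splitlines():
        m = DISTINFO_LINE.match(line.strip())
        if m:
            field, name, value = m.groups()
            entries.setdefault(name, {})[field] = value
    return entries


def verify_distfile(path, expected):
    """Compare a file on disk against one parsed distinfo entry.

    Only the fields this sketch knows about (BLAKE2s, SHA512, Size)
    are checked; any other field in `expected` is ignored.
    """
    digests = {"BLAKE2s": hashlib.blake2s(), "SHA512": hashlib.sha512()}
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so large distfiles are not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            for h in digests.values():
                h.update(chunk)
    actual = {name: h.hexdigest() for name, h in digests.items()}
    actual["Size"] = str(os.path.getsize(path))
    return all(actual[k] == v for k, v in expected.items() if k in actual)
```

With the distinfo contents read into a string, `parse_distinfo(text)["sentencepiece-0.1.97.tar.gz"]` would yield the expected values, and `verify_distfile` would return `True` only if every known field matches the local file.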